METROPOLIS-HASTINGS ALGORITHMS FOR 
PERTURBATIONS OF GAUSSIAN MEASURES IN HIGH 
DIMENSIONS: CONTRACTION PROPERTIES AND ERROR 
BOUNDS IN THE LOGCONCAVE CASE 

ANDREAS EBERLE 

Abstract. The Metropolis-adjusted Langevin algorithm (MALA) is a Metropolis-Hastings
method for approximate sampling from continuous distributions. We derive upper bounds
for the contraction rate in Kantorovich-Rubinstein-Wasserstein distance of the MALA
chain with semi-implicit Euler proposals applied to log-concave probability measures
that have a density w.r.t. a Gaussian reference measure. For sufficiently "regular"
densities, the estimates are dimension-independent, and they hold for sufficiently
small step sizes $h$ that do not depend on the dimension either. In the limit
$h \downarrow 0$, the bounds approach the known optimal contraction rates for
overdamped Langevin diffusions in a convex potential.

A similar approach also applies to Metropolis-Hastings chains with Ornstein-Uhlenbeck
proposals. In that case, the resulting estimates are still independent of the
dimension but less optimal, reflecting the fact that MALA is a higher order
approximation of the diffusion limit than Metropolis-Hastings with Ornstein-Uhlenbeck
proposals.






1. Introduction 



The performance of Metropolis-Hastings (MH) methods [23], [16], [27] for sampling
probability measures on high dimensional continuous state spaces has attracted
growing attention in recent years. The pioneering works by Roberts, Gelman and
Gilks [2B] and Roberts and Rosenthal [221 show in particular that for product
measures $\pi^d$ on $\mathbb{R}^d$, the average acceptance probabilities for the Random Walk
Metropolis algorithm (RWM) and the Metropolis-adjusted Langevin algorithm
(MALA) converge to a strictly positive limit as $d \to \infty$ only if the step sizes
$h$ go to zero of order $O(d^{-1})$, $O(d^{-1/3})$ respectively. In that case, a diffusion
limit as $d \to \infty$ has been derived, leading to an optimal scaling of the step sizes
maximizing the speed of the limiting diffusion, and an asymptotically optimal
acceptance probability.

Recently, the optimal scaling results for RWM and MALA have been extended
significantly to targets that are not of product form but have a sufficiently regular
density w.r.t. a Gaussian measure, cf. [22] 1^ . On the other hand, it has been
pointed out [31 [H [Hj, [8] that for corresponding perturbations of Gaussian measures,
the acceptance probability has a strictly positive limit as $d \to \infty$ for small
step sizes that do not depend on the dimension, provided the Random Walk or
Euler proposals in RWM and MALA are replaced by Ornstein-Uhlenbeck or
semi-implicit ("preconditioned") Euler proposals respectively, cf. also below. Pillai,
Stuart and Thiery [25] show that in this case, the Metropolis-Hastings algorithm
can be realized directly on an infinite dimensional Hilbert space arising in the limit
as $d \to \infty$, and the corresponding Markov chain converges weakly to an infinite
dimensional overdamped Langevin diffusion as $h \downarrow 0$.

Mixing properties and convergence to equilibrium of Langevin diffusions have 
been studied intensively [151 EH E El El [31]. In particular, it is well-known that 
contractivity and exponential convergence to equilibrium in Wasserstein distance 
can be quantified if the stationary distribution is strictly log-concave [3 135], cf. also
[11] for a recent extension to the non-log-concave case. Because of the diffusion
limit results, one might expect that the approximating Metropolis-Hastings chains
have similar convergence properties. However, this heuristic may also be wrong,
since the convergence of the Markov chains to the diffusion is known only in a 
weak and non-quantitative sense. 

Although there is a huge number of results quantifying the speed of convergence
to equilibrium for Markov chains on discrete state spaces (cf. [HI |32] for
an overview), there are relatively few quantitative results on Metropolis-Hastings
chains on $\mathbb{R}^d$ when $d$ is large. The most remarkable exceptions are the well-known
works [10], [17], [19], [20], [21] which prove an upper bound for the mixing time that
is polynomial in the dimension for Metropolis chains with ball walk proposals for
uniform measures on convex sets and more general log-concave measures.

Below, we develop an approach to quantify Wasserstein contractivity and conver- 
gence to equilibrium in a dimension-independent way for the Metropolis-Hastings 
chains with Ornstein-Uhlenbeck and semi-implicit Euler proposals. Our approach 
applies in the strictly log-concave case (or, more generally, if the measure is strictly 
log-concave on an appropriate ball) and yields bounds for small step sizes that are 
very precise. The results for semi-implicit Euler proposals require less restrictive 
assumptions than those for Ornstein-Uhlenbeck proposals, reflecting the fact that 
the corresponding Markov chain is a higher order approximation of the diffusion. 

Our results are closely related and complementary to the recent work [13], and
to the dimension-dependent geometric ergodicity results in [5]. In particular, in
[13], M. Hairer, A. Stuart and S. Vollmer apply related methods to establish
exponential convergence to equilibrium in Wasserstein distance for Metropolis-Hastings
chains with Ornstein-Uhlenbeck proposals in a less quantitative way, but without 
assuming log-concavity. In the context of probability measures on function spaces, 
the techniques developed here are applied in the PhD Thesis [12] of D. Gruhlke. I 
would like to thank in particular Daniel Gruhlke and Sebastian Vollmer for many 
fruitful discussions related to the contents of this paper. 

We now recall some basic facts on Metropolis-Hastings algorithms, and describe
our setup and the main results. Sections 2 and 3 contain basic results
on Wasserstein contractivity of Metropolis-Hastings kernels, and contractivity of
the proposal kernels. In Sections 4 and 5 we prove bounds quantifying rejection
probabilities and the dependence of the rejection event on the current state for
Ornstein-Uhlenbeck and semi-implicit Euler proposals. These bounds, combined
with an upper bound for the exit probability of the corresponding
Metropolis-Hastings chains from a given ball derived in Section 6, are crucial for the proof of
the main results in Section 7.

1.1. Metropolis-Hastings algorithms. Let $U : \mathbb{R}^d \to \mathbb{R}$ be a lower bounded
measurable function such that

    $\mathcal{Z} = \int_{\mathbb{R}^d} \exp(-U(x))\,dx < \infty,$

and let $\mu$ denote the probability measure on $\mathbb{R}^d$ with density proportional to
$\exp(-U)$. We use the same letter $\mu$ for the measure and its density, i.e.,

(1.1)    $\mu(dx) = \mu(x)\,dx = \mathcal{Z}^{-1} \exp(-U(x))\,dx.$

Below, we view the measure $\mu$ defined by (1.1) as a perturbation of the standard
normal distribution $\gamma^d$ in $\mathbb{R}^d$, i.e., we decompose

(1.2)    $U(x) = \tfrac{1}{2}|x|^2 + V(x), \qquad x \in \mathbb{R}^d,$

with a measurable function $V : \mathbb{R}^d \to \mathbb{R}$, and obtain the representation

(1.3)    $\mu(dx) = Z^{-1} \exp(-V(x))\,\gamma^d(dx)$

with normalization constant $Z = \mathcal{Z}/(2\pi)^{d/2}$. Here $|\cdot|$ denotes the Euclidean
norm.

Note that in $\mathbb{R}^d$, any probability measure with a strictly positive density can
be represented as an absolutely continuous perturbation of $\gamma^d$ as in (1.3). In an
infinite dimensional limit, however, the density may degenerate. Nevertheless, also 
on infinite dimensional spaces, absolutely continuous perturbations of Gaussian 
measures form an important and widely used class of models. 

Example 1.1 (Transition Path Sampling). We briefly describe a typical application,
cf. [13] and [12] for details. Suppose that we are interested in sampling
a trajectory of a diffusion process in $\mathbb{R}^\ell$ conditioned to a given end-point $b$ at
time $t = 1$. We assume that the unconditioned diffusion process $(Y_t, P)$ satisfies a
stochastic differential equation of the form

(1.4)    $dY_t = -\nabla H(Y_t)\,dt + dB_t$

where $(B_t)$ is an $\ell$-dimensional Brownian motion, and $H \in C^2(\mathbb{R}^\ell)$ is bounded
from below. Then, by Girsanov's Theorem and Itô's formula, a regular version of
the law of the conditioned process satisfying $Y_0 = a$ and $Y_1 = b$ on the path space
$E = \{ y \in C([0,1], \mathbb{R}^\ell) : y_0 = a,\ y_1 = b \}$ is given by

(1.5)    $\mu(dy) = C^{-1} \exp(-V(y))\,\gamma(dy),$

where $\gamma$ is the law of the Brownian bridge from $a$ to $b$,

(1.6)    $V(y) = \frac{1}{2} \int_0^1 \phi(y_s)\,ds$ with $\phi(x) = |\nabla H(x)|^2 - \Delta H(x),$

and $C = \exp(H(b) - H(a))$, cf. [31]. In order to obtain finite dimensional
approximations of the measure $\mu$ on $E$, we consider the Wiener-Lévy expansion

(1.7)    $y_t = e_t + \sum_{n=0}^{\infty} \sum_{k=0}^{2^n - 1} \sum_{i=1}^{\ell} x_{n,k,i}\, e_t^{n,k,i}, \qquad t \in [0,1],$

of a path $y \in E$ in terms of the basis functions $e_t = (1-t)a + tb$ and
$e_t^{n,k,i} = 2^{-n/2} g(2^n t - k)\, e^i$ with $g(s) = \min(s, 1-s)^+$. Here the coefficients
$x_{n,k,i}$, $n \ge 0$, $0 \le k < 2^n$, $1 \le i \le \ell$, are real numbers. Recall that truncating
the series at $n = m - 1$ corresponds to taking the polygonal interpolation of the path $y$
adapted to the dyadic partition $\{ k 2^{-m} : k = 0, 1, \dots, 2^m \}$ of the interval
$[0,1]$. Now fix $m \in \mathbb{N}$, let $d = (2^m - 1)\ell$, and let

    $x^d = (x_{n,k,i} : 0 \le n < m,\ 0 \le k < 2^n,\ 1 \le i \le \ell) \in \mathbb{R}^d$

denote the vector consisting of the first $d$ components in the basis expansion of
a path $y \in E$. Then the image of the Brownian bridge measure $\gamma$ under the
projection $\pi_d : E \to \mathbb{R}^d$ that maps $y$ to $x^d$ is the $d$-dimensional standard normal
distribution $\gamma^d$, cf. e.g. [33]. Therefore, a natural finite dimensional approximation
to the infinite dimensional sampling problem described above consists in sampling
from the probability measure

(1.8)    $\mu_d(dx) = Z_d^{-1} \exp(-V_d(x))\,\gamma^d(dx)$

on $\mathbb{R}^d$ where $Z_d$ is a normalization constant, and

(1.9)    $V_d(x) = 2^{-m-1} \Big( \tfrac{1}{2}\phi(y_0) + \sum_{k=1}^{2^m - 1} \phi(y_{k 2^{-m}}) + \tfrac{1}{2}\phi(y_1) \Big)$

with $y = e + \sum_{n<m} \sum_k \sum_i x_{n,k,i}\, e^{n,k,i}$ denoting the polygonal path corresponding
to $x = (x_{n,k,i}) \in \mathbb{R}^d$.
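
As a rough illustration of the discretization in Example 1.1, the following Python sketch evaluates a trapezoidal approximation of $V(y) = \frac{1}{2}\int_0^1 \phi(y_s)\,ds$ on the dyadic grid, which is how (1.9) is read here. The function name `trapezoid_potential` and the scalar-valued path are hypothetical simplifications, not part of the construction above.

```python
def trapezoid_potential(phi, path_values):
    """Approximate V(y) = 1/2 * int_0^1 phi(y_s) ds by the trapezoidal rule.

    path_values holds the path y at the 2^m + 1 dyadic points k * 2^-m,
    k = 0, ..., 2^m (scalar values, for simplicity).
    """
    n_intervals = len(path_values) - 1            # equals 2^m
    if n_intervals < 1:
        raise ValueError("need at least two grid points")
    weights = [1.0] * (n_intervals + 1)
    weights[0] = weights[-1] = 0.5                # endpoint weights 1/2
    total = sum(w * phi(y) for w, y in zip(weights, path_values))
    # 0.5 / n_intervals is the prefactor 2^{-m-1} when n_intervals = 2^m
    return 0.5 * total / n_intervals
```

For a linear path and a linear integrand the trapezoidal rule is exact, which gives a simple sanity check of the weights and the prefactor.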

Returning to our general setup, suppose that $p(x, dy) = p(x,y)\,dy$ is an absolutely
continuous transition kernel on $\mathbb{R}^d$ with strictly positive densities $p(x,y)$.
Let

(1.10)    $\alpha(x,y) = \min\Big( \frac{\mu(y)\,p(y,x)}{\mu(x)\,p(x,y)},\ 1 \Big), \qquad x, y \in \mathbb{R}^d.$

Note that $\alpha(x,y)$ does not depend on the normalization constant $\mathcal{Z}$. The
Metropolis-Hastings algorithm with proposal kernel $p$ is the following Markov chain
Monte Carlo method for approximate sampling and Monte Carlo integration w.r.t. $\mu$:

(1) Choose an initial state $X_0$.

(2) For $n := 0, 1, 2, \dots$ do

• Sample $Y_n \sim p(X_n, dy)$ and $U_n \sim \mathrm{Unif}(0,1)$ independently.

• If $U_n < \alpha(X_n, Y_n)$ then accept the proposal and set $X_{n+1} := Y_n$,
else reject the proposal and set $X_{n+1} := X_n$.

The algorithm generates a time-homogeneous Markov chain $(X_n)_{n = 0, 1, 2, \dots}$ with
initial state $X_0$ and transition kernel

(1.11)    $q(x, dy) = \alpha(x,y)\,p(x,y)\,dy + r(x)\,\delta_x(dy).$

Here

(1.12)    $r(x) = 1 - q(x, \mathbb{R}^d \setminus \{x\}) = 1 - \int \alpha(x,y)\,p(x,y)\,dy$

is the average rejection probability for the proposal when the Markov chain is at
$x$. Note that $q(x, dy)$ restricted to $\mathbb{R}^d \setminus \{x\}$ is again absolutely continuous with
density

    $q(x,y) = \alpha(x,y)\,p(x,y).$

Since

    $\mu(x)\,q(x,y) = \alpha(x,y)\,\mu(x)\,p(x,y) = \min\big( \mu(y)\,p(y,x),\ \mu(x)\,p(x,y) \big)$

is a symmetric function in $x$ and $y$, the kernel $q(x, dy)$ satisfies the detailed balance
condition

(1.13)    $\mu(dx)\,q(x, dy) = \mu(dy)\,q(y, dx).$

In particular, $\mu$ is a stationary distribution for the Metropolis-Hastings chain, and
the chain with initial distribution $\mu$ is reversible. Therefore, under appropriate
ergodicity assumptions, the distribution of $X_n$ will converge to $\mu$ as $n \to \infty$.
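
The Metropolis-Hastings iteration and the symmetry behind the detailed balance condition (1.13) can be sketched in a few lines of Python. This is a minimal one-dimensional illustration under stated assumptions: the target $U(x) = x^2/2 + x^4/4$, the Gaussian proposal $N(0.9\,x, 0.5^2)$, and all function names are hypothetical choices made only for this example.

```python
import math
import random

def acceptance_probability(log_mu, log_p, x, y):
    """alpha(x, y) = min(mu(y) p(y, x) / (mu(x) p(x, y)), 1), evaluated in log space."""
    log_ratio = (log_mu(y) + log_p(y, x)) - (log_mu(x) + log_p(x, y))
    return math.exp(min(0.0, log_ratio))

def mh_step(log_mu, log_p, sample_p, x, rng):
    """One Metropolis-Hastings transition: propose, then accept or reject."""
    y = sample_p(x, rng)                   # Y_n ~ p(X_n, dy)
    u = rng.random()                       # U_n ~ Unif(0, 1), independent of Y_n
    return y if u < acceptance_probability(log_mu, log_p, x, y) else x

# Hypothetical target: U(x) = x^2/2 + x^4/4 (unnormalized; the constant cancels in alpha).
def log_mu(x):
    return -(0.5 * x * x + 0.25 * x ** 4)

SCALE, SPREAD = 0.9, 0.5                   # hypothetical proposal parameters

def log_p(x, y):
    # log density of N(SCALE * x, SPREAD^2) at y, up to a constant that cancels in alpha
    return -((y - SCALE * x) ** 2) / (2.0 * SPREAD ** 2)

def sample_p(x, rng):
    return rng.gauss(SCALE * x, SPREAD)

def run_chain(x0, n_steps, seed=0):
    """Run n_steps transitions from x0 and return the final state."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        x = mh_step(log_mu, log_p, sample_p, x, rng)
    return x
```

Since $\mu(x)\,p(x,y)\,\alpha(x,y) = \min(\mu(y)p(y,x), \mu(x)p(x,y))$, evaluating this product at $(x,y)$ and at $(y,x)$ should give the same number up to rounding.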

To analyze Metropolis-Hastings algorithms it is convenient to introduce the function

(1.14)    $G(x,y) = \log \frac{\mu(x)\,p(x,y)}{\mu(y)\,p(y,x)} = U(y) - U(x) + \log \frac{p(x,y)}{p(y,x)}.$

For any $x, y \in \mathbb{R}^d$,

(1.15)    $\alpha(x,y) = \exp(-G(x,y)^+).$

In particular, for any $x, y, \tilde{x}, \tilde{y} \in \mathbb{R}^d$,

(1.16)    $1 - \alpha(x,y) \le G(x,y)^+,$

(1.17)    $(\alpha(x,y) - \alpha(\tilde{x},\tilde{y}))^+ \le (G(x,y) - G(\tilde{x},\tilde{y}))^-,$ and

(1.18)    $(\alpha(x,y) - \alpha(\tilde{x},\tilde{y}))^- \le (G(x,y) - G(\tilde{x},\tilde{y}))^+.$

The function $G(x,y)$ defined by (1.14) can also be represented in terms of $V$:
Indeed, since by (1.3),

    $\mu(x) = Z^{-1} \exp(-V(x))\,\gamma^d(x),$

we have

(1.19)    $G(x,y) = V(y) - V(x) + \log \frac{\gamma^d(x)\,p(x,y)}{\gamma^d(y)\,p(y,x)}$

where $\gamma^d(x) = (2\pi)^{-d/2} \exp(-|x|^2/2)$ denotes the standard normal density in $\mathbb{R}^d$.

1.2. Metropolis-Hastings algorithms with Gaussian proposals. We aim at
proving contractivity of Metropolis-Hastings kernels w.r.t. appropriate
Kantorovich-Rubinstein-Wasserstein distances. For this purpose, we are looking for a proposal
kernel that has adequate contractivity properties and sufficiently small rejection
probabilities. The rejection probability is small if the proposal kernel approximately
satisfies the detailed balance condition w.r.t. $\mu$.

1.2.1. Ornstein-Uhlenbeck proposals. A straightforward approach would be to use
a proposal density that satisfies the detailed balance condition

(1.20)    $\gamma^d(x)\,p(x,y) = \gamma^d(y)\,p(y,x)$ for any $x, y \in \mathbb{R}^d$

w.r.t. the standard normal distribution. In this case,

(1.21)    $G(x,y) = V(y) - V(x).$

The simplest form of proposal distributions satisfying (1.20) are the transition
kernels of AR(1) (discrete Ornstein-Uhlenbeck) processes given by

(1.22)    $p_h^{OU}(x, \cdot) = N\Big( \big(1 - \tfrac{h}{2}\big)x,\ \big(h - \tfrac{h^2}{4}\big) \cdot I_d \Big)$

for some constant $h \in (0,2)$. If $Z$ is a standard normally distributed $\mathbb{R}^d$-valued
random variable then the random variables

(1.23)    $Y_h^{OU}(x) := \big(1 - \tfrac{h}{2}\big)x + \sqrt{h - \tfrac{h^2}{4}}\; Z$

have distributions $p_h^{OU}(x, dy)$. Note that by (1.21), the acceptance probabilities

(1.24)    $\alpha^{OU}(x,y) = \exp(-G^{OU}(x,y)^+) = \exp(-(V(y) - V(x))^+)$

for Ornstein-Uhlenbeck proposals do not depend on $h$.
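
The Ornstein-Uhlenbeck proposal and its acceptance probability are simple enough to write down directly. The sketch below is only illustrative: the function names are hypothetical, vectors are plain Python lists, and `ou_variance_identity` records the elementary identity $(1 - h/2)^2 + (h - h^2/4) = 1$, which is why the proposal leaves the standard normal distribution invariant.

```python
import math
import random

def ou_proposal(x, h, rng):
    """Draw Y = (1 - h/2) x + sqrt(h - h^2/4) Z with Z ~ N(0, I_d), cf. (1.23)."""
    a = 1.0 - h / 2.0
    s = math.sqrt(h - h * h / 4.0)
    return [a * xi + s * rng.gauss(0.0, 1.0) for xi in x]

def ou_acceptance(V, x, y):
    """alpha^OU(x, y) = exp(-(V(y) - V(x))^+), cf. (1.24); it does not involve h."""
    return math.exp(-max(V(y) - V(x), 0.0))

def ou_variance_identity(h):
    """Variance of one coordinate of Y when x ~ N(0, 1): (1 - h/2)^2 + (h - h^2/4)."""
    return (1.0 - h / 2.0) ** 2 + (h - h * h / 4.0)
```

A usage example with a hypothetical perturbation $V(x) = 0.1 \sum_i x_i^4$: `ou_acceptance(lambda v: 0.1 * sum(t ** 4 for t in v), x, ou_proposal(x, 0.5, random.Random(0)))`.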

1.2.2. Euler proposals. In continuous time, under appropriate regularity and growth
conditions on $V$, detailed balance w.r.t. $\mu$ is satisfied exactly by the transition
functions of the diffusion process solving the over-damped Langevin stochastic
differential equation

(1.25)    $dX_t = -\tfrac{1}{2} X_t\,dt - \tfrac{1}{2} \nabla V(X_t)\,dt + dB_t,$

because the generator

    $\mathcal{L} = \tfrac{1}{2}\Delta - \tfrac{1}{2}\, x \cdot \nabla - \tfrac{1}{2}\, \nabla V \cdot \nabla = \tfrac{1}{2}\big( \Delta - \nabla U \cdot \nabla \big)$

is a self-adjoint operator on an appropriate dense subspace of $L^2(\mathbb{R}^d; \mu)$, cf. [31].
Although we cannot compute and sample from the transition functions exactly,
we can use approximations as proposals in a Metropolis-Hastings algorithm. A
corresponding MH algorithm where the proposals are obtained from a discretization
scheme for the SDE (1.25) is called a Metropolis-adjusted Langevin algorithm
(MALA), cf. [301 EH-

In this paper, we focus on the MALA scheme with proposal kernel

(1.26)    $p_h(x, \cdot) = N\Big( \big(1 - \tfrac{h}{2}\big)x - \tfrac{h}{2} \nabla V(x),\ \big(h - \tfrac{h^2}{4}\big) \cdot I_d \Big)$

for some constant $h \in (0,2)$, i.e., $p_h(x, \cdot)$ is the distribution of

(1.27)    $Y_h(x) = x - \tfrac{h}{2} \nabla U(x) + \sqrt{h - \tfrac{h^2}{4}}\; Z = \big(1 - \tfrac{h}{2}\big)x - \tfrac{h}{2} \nabla V(x) + \sqrt{h - \tfrac{h^2}{4}}\; Z,$

where $Z \sim \gamma^d$ is a standard normal random variable with values in $\mathbb{R}^d$.

Note that if $h - h^2/4$ is replaced by $h$, then (1.27) is a standard Euler
discretization step for the SDE (1.25). Replacing $h$ by $h - h^2/4$ ensures that detailed
balance is satisfied exactly for $V \equiv 0$. Alternatively, (1.27) can be viewed as a
semi-implicit Euler discretization step for (1.25):

Remark 1.2 (Euler schemes). The explicit Euler discretization of the over-damped
Langevin equation (1.25) with time step size $h > 0$ is given by

(1.28)    $X_{n+1} = \big(1 - \tfrac{h}{2}\big) X_n - \tfrac{h}{2} \nabla V(X_n) + \sqrt{h}\, Z_{n+1}, \qquad n = 0, 1, 2, \dots,$

where $Z_n$, $n \in \mathbb{N}$, are i.i.d. random variables with distribution $\gamma^d$. The process
$(X_n)$ defined by (1.28) is a time-homogeneous Markov chain with transition kernel

(1.29)    $p_h^{\mathrm{Euler}}(x, \cdot) = N\Big( \big(1 - \tfrac{h}{2}\big)x - \tfrac{h}{2} \nabla V(x),\ h \cdot I_d \Big).$

Even for $V \equiv 0$, the measure $\mu$ is not a stationary distribution for the kernel
$p_h^{\mathrm{Euler}}$. A semi-implicit Euler scheme for (1.25) with time-step size $\varepsilon > 0$ is given by

(1.30)    $X_{n+1} - X_n = -\tfrac{\varepsilon}{2} \cdot \frac{X_{n+1} + X_n}{2} - \tfrac{\varepsilon}{2} \nabla V(X_n) + \sqrt{\varepsilon}\, Z_{n+1}$

with $Z_n$ i.i.d. with distribution $\gamma^d$, cf. [13]. Note that the scheme is implicit only
in the linear part of the drift but explicit in $\nabla V$. Solving for $X_{n+1}$ in (1.30) and
substituting $h = \varepsilon / (1 + \varepsilon/4)$ with $h \in (0,2)$ yields the equivalent equation

(1.31)    $X_{n+1} = \big(1 - \tfrac{h}{2}\big) X_n - \tfrac{h}{2} \nabla V(X_n) + \sqrt{h - \tfrac{h^2}{4}}\; Z_{n+1}.$

We call the Metropolis-Hastings algorithm with proposal kernel $p_h(x, \cdot)$ a
semi-implicit MALA scheme with step size $h$.
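
The algebra in Remark 1.2 can be checked numerically. The following one-dimensional sketch (function names are hypothetical) solves the semi-implicit relation (1.30) for $X_{n+1}$, compares the result with the right-hand side of (1.31) under the substitution $h = \varepsilon/(1 + \varepsilon/4)$, and computes the stationary variance of the explicit Euler chain (1.28) for $V \equiv 0$, which solves $v = (1 - h/2)^2 v + h$ and therefore differs from 1.

```python
import math

def semi_implicit_step(x, grad_v, eps, z):
    """Solve (1.30) for X_{n+1}:
    X' (1 + eps/4) = X (1 - eps/4) - (eps/2) grad_v + sqrt(eps) z."""
    rhs = (1.0 - eps / 4.0) * x - 0.5 * eps * grad_v + math.sqrt(eps) * z
    return rhs / (1.0 + eps / 4.0)

def h_from_eps(eps):
    """Step size substitution h = eps / (1 + eps/4)."""
    return eps / (1.0 + eps / 4.0)

def mala_proposal_step(x, grad_v, h, z):
    """Right-hand side of (1.31)."""
    return (1.0 - h / 2.0) * x - 0.5 * h * grad_v + math.sqrt(h - h * h / 4.0) * z

def explicit_euler_stationary_variance(h):
    """Fixed point of v = (1 - h/2)^2 v + h, i.e. v = 1 / (1 - h/4), for V = 0."""
    return 1.0 / (1.0 - h / 4.0)
```

For any $h \in (0,2)$ the explicit Euler variance exceeds 1, in line with the remark that $\mu$ is not stationary for $p_h^{\mathrm{Euler}}$ even when $V \equiv 0$.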

Proposition 1.3 (Acceptance probabilities for semi-implicit MALA). Let
$V \in C^1(\mathbb{R}^d)$ and $h \in (0,2)$. Then the acceptance probabilities for the
Metropolis-adjusted Langevin algorithm with proposal kernels $p_h$ are given by
$\alpha_h(x,y) = \exp(-G_h(x,y)^+)$ with

(1.32)    $G_h(x,y) = V(y) - V(x) - \frac{y - x}{2} \cdot \big( \nabla V(y) + \nabla V(x) \big)$
          $\qquad + \frac{h}{8 - 2h} \Big[ (y + x) \cdot \big( \nabla V(y) - \nabla V(x) \big) + |\nabla V(y)|^2 - |\nabla V(x)|^2 \Big].$

For explicit Euler proposals with step size $h > 0$, a corresponding representation
holds with

(1.33)    $G_h^{\mathrm{Euler}}(x,y) = V(y) - V(x) - \frac{y - x}{2} \cdot \big( \nabla V(y) + \nabla V(x) \big)$
          $\qquad + \frac{h}{8} \Big[ |y + \nabla V(y)|^2 - |x + \nabla V(x)|^2 \Big].$

The proof of the proposition is given in Section 4 below.

Remark 1.4. For explicit Euler proposals, the $O(h)$ correction term in (1.33) does
not vanish for $V \equiv 0$. More significantly, this term goes to infinity as $|y - x| \to \infty$,
and the variance of $y - x$ w.r.t. the proposal distribution is of order $O(d)$.
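
The representation of $G_h$ in Proposition 1.3 can be cross-checked numerically against the definition (1.19), i.e. against $V(y) - V(x) + \log\big(\gamma^d(x) p_h(x,y)\big) - \log\big(\gamma^d(y) p_h(y,x)\big)$. The sketch below does this for a hypothetical perturbation $V(x) = \frac14 \sum_i x_i^4$; the function names and the choice of $V$ are assumptions made only for this illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def V(x):
    return sum(0.25 * xi ** 4 for xi in x)          # hypothetical perturbation

def grad_V(x):
    return [xi ** 3 for xi in x]

def g_h_formula(x, y, h):
    """G_h(x, y) as in (1.32)."""
    gx, gy = grad_V(x), grad_V(y)
    diff = [yi - xi for xi, yi in zip(x, y)]
    total = [yi + xi for xi, yi in zip(x, y)]
    g_sum = [a + b for a, b in zip(gy, gx)]
    g_diff = [a - b for a, b in zip(gy, gx)]
    correction = dot(total, g_diff) + dot(gy, gy) - dot(gx, gx)
    return V(y) - V(x) - 0.5 * dot(diff, g_sum) + h / (8.0 - 2.0 * h) * correction

def log_proposal(x, y, h):
    """log p_h(x, y) up to an additive constant that does not depend on x or y."""
    mean = [(1.0 - h / 2.0) * xi - 0.5 * h * gi for xi, gi in zip(x, grad_V(x))]
    var = h - h * h / 4.0
    return -sum((yi - mi) ** 2 for yi, mi in zip(y, mean)) / (2.0 * var)

def g_h_direct(x, y, h):
    """G_h(x, y) from (1.19) with the semi-implicit Euler proposal (1.26)."""
    log_gamma_x = -0.5 * dot(x, x)                   # log gamma^d up to a constant
    log_gamma_y = -0.5 * dot(y, y)
    return (V(y) - V(x)
            + (log_gamma_x + log_proposal(x, y, h))
            - (log_gamma_y + log_proposal(y, x, h)))
```

Both functions should agree up to rounding for any $h \in (0,2)$, since the quadratic terms $|x|^2 - |y|^2$ cancel exactly because $1 - (1 - h/2)^2 = h - h^2/4$.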

1.3. Bounds for rejection probabilities. We fix a norm $\|\cdot\|_-$ on $\mathbb{R}^d$ such that

(1.34)    $\|x\|_- \le |x|$ for any $x \in \mathbb{R}^d$.

We assume that $V$ is sufficiently smooth w.r.t. $\|\cdot\|_-$ with derivatives growing at
most polynomially:

Assumption 1.5. The function $V$ is in $C^4(\mathbb{R}^d)$, and for any $n \in \{1, 2, 3, 4\}$, there
exist finite constants $C_n \in [0, \infty)$, $p_n \in \{0, 1, 2, \dots\}$ such that

    $|(\partial^n_{\xi_1, \dots, \xi_n} V)(x)| \le C_n \max(1, \|x\|_-)^{p_n}\, \|\xi_1\|_- \cdots \|\xi_n\|_-$

holds for any $x \in \mathbb{R}^d$ and $\xi_1, \dots, \xi_n \in \mathbb{R}^d$.

For discretizations of infinite dimensional models, $\|\cdot\|_-$ will typically be a finite
dimensional approximation of a norm that is almost surely finite w.r.t. the limit
measure in infinite dimensions.

Example 1.6 (Transition Path Sampling). Consider the situation of Example
1.1 and assume that $H$ is in $C^6(\mathbb{R}^\ell)$. Then by (1.6) and (1.9), $V_d$ is $C^4$. For $n \le 4$
and $x, \xi_1, \dots, \xi_n \in \mathbb{R}^d$, the directional derivatives of $V_d$ are given by

(1.35)    $\partial^n_{\xi_1 \dots \xi_n} V_d(x) = 2^{-m-1} \sum_{k=0}^{2^m} w_k\, D^n\phi(y_{k 2^{-m}}) \big[ \eta_{1, k 2^{-m}}, \dots, \eta_{n, k 2^{-m}} \big]$

where $y, \eta_1, \dots, \eta_n$ are the polygonal paths in $E$ corresponding to $x, \xi_1, \dots, \xi_n$
respectively, $w_k = 1$ for $k = 1, \dots, 2^m - 1$, and $w_0 = w_{2^m} = 1/2$. Assuming
$\|D^4\phi(z)\| = O(|z|^r)$ for some integer $r \ge 0$ as $|z| \to \infty$, we can estimate

    $|\partial^n_{\xi_1 \dots \xi_n} V_d(x)| \le C_n \max(1, \|y\|_{L^q})^{p_n}\, \|\eta_1\|_{L^q} \cdots \|\eta_n\|_{L^q}$

where $q = r + 4$, $p_n = r + (4 - n)$, $\|y\|_{L^q} = \big( 2^{-m} \sum_{k=0}^{2^m} w_k |y_{k 2^{-m}}|^q \big)^{1/q}$ is a discrete $L^q$
norm of the polygonal path $y$, and $C_1, \dots, C_4$ are finite constants that do not
depend on the dimension $d$. One could now choose for the minus norm the norm
on $\mathbb{R}^d$ corresponding to the discrete $L^q$ norm on polygonal paths. However, it is
more convenient to choose a norm coming from an inner product. To this end, we
consider the norms

    $\|y\|_\alpha = \Big( \sum_{n \ge 0} 2^{-2\alpha n} \sum_{k,i} x_{n,k,i}^2 \Big)^{1/2}$

on path space $E$, and the induced norms

    $\|x\|_\alpha = \Big( \sum_{n < m} 2^{-2\alpha n} \sum_{k,i} x_{n,k,i}^2 \Big)^{1/2}$

on $\mathbb{R}^d$ where $d = (2^m - 1)\ell$. One can show that for $\alpha < 1/2 + 1/q$, the $L^q$ norm can
be bounded from above by $\|\cdot\|_\alpha$ independently of the dimension, cf. [12]. On the
other hand, if $\alpha > 1/2$ then $\|y\|_\alpha < \infty$ for $\gamma$-almost every path $y$ of the Brownian
bridge. This property will be crucial when restricting to balls w.r.t. $\|\cdot\|_\alpha$. For
$\|\cdot\|_- = \|\cdot\|_\alpha$ with $\alpha \in (1/2, 1/2 + 1/q)$, both requirements are satisfied, and
Assumption 1.5 holds with constants that do not depend on the dimension.

The next proposition yields in particular an upper bound for the average rejection
probability w.r.t. both Ornstein-Uhlenbeck and semi-implicit Euler proposals
at a given position $x \in \mathbb{R}^d$, cf. [6] for an analogous result:

Proposition 1.7 (Upper bounds for MH rejection probabilities). Suppose
that Assumption 1.5 is satisfied and let $k \in \mathbb{N}$. Then there exist polynomials
$\mathcal{P}_k^{OU} : \mathbb{R} \to \mathbb{R}_+$ and $\mathcal{P}_k : \mathbb{R}^2 \to \mathbb{R}_+$ of degrees $p_1 + 1$, $\max(p_3 + 3, 2p_2 + 2)$
respectively, such that for any $x \in \mathbb{R}^d$ and $h \in (0,2)$,

    $E\big[ (1 - \alpha^{OU}(x, Y_h^{OU}(x)))^k \big]^{1/k} \le \mathcal{P}_k^{OU}(\|x\|_-) \cdot h^{1/2},$ and
    $E\big[ (1 - \alpha_h(x, Y_h(x)))^k \big]^{1/k} \le \mathcal{P}_k(\|x\|_-, \|\nabla U(x)\|_-) \cdot h^{3/2}.$

The result is a consequence of Proposition 1.3. The proof is given in Section 4
below.

Remark 1.8. (1) The polynomials $\mathcal{P}_k^{OU}$ and $\mathcal{P}_k$ in Proposition 1.7 are explicit,
cf. the proof below. They depend only on the values $C_n, p_n$ in Assumption 1.5 for
$n = 1$, $n = 2, 3$ respectively, and on the moments

    $m_n = E[\|Z\|_-^n]$, $\quad n \le k \cdot (p_1 + 1)$, $\quad n \le k \cdot \max(p_3 + 3, 2p_2 + 2)$ resp.,

but they do not depend on the dimension $d$. For semi-implicit Euler proposals,
the upper bound in Proposition 1.7 is stated in explicit form for the case $k = 1$
and $p_2 = p_3 = 0$ in (4.6) below.

(2) For explicit Euler proposals, corresponding estimates hold with $m_n$ replaced by
$\tilde{m}_n = E[|Z|^n]$, cf. Remark 4.3 below. Note, however, that $\tilde{m}_n \to \infty$ as $d \to \infty$.

Our next result is a bound of order $O(h^{1/2})$, $O(h^{3/2})$ respectively, for the average
dependence of the acceptance event on the current state w.r.t. Ornstein-Uhlenbeck
and semi-implicit Euler proposals. Let $\|\cdot\|_+$ denote the dual norm of $\|\cdot\|_-$ on
$\mathbb{R}^d$, i.e.,

    $\|\xi\|_+ = \sup\{ \xi \cdot \eta \mid \eta \in \mathbb{R}^d \text{ with } \|\eta\|_- \le 1 \}.$

Note that

    $\|\xi\|_- \le |\xi| \le \|\xi\|_+ \qquad \forall\, \xi \in \mathbb{R}^d.$

For a function $F \in C^1(\mathbb{R}^d)$,

    $|F(y) - F(x)| = \Big| \int_0^1 (y - x) \cdot \nabla F((1-t)x + ty)\,dt \Big| \le \|y - x\|_- \cdot \sup_{z \in [x,y]} \|\nabla F(z)\|_+,$

i.e., the plus norm of $\nabla F$ determines the Lipschitz constant w.r.t. the minus norm.

Proposition 1.9 (Dependence of rejection on the current state). Suppose
that Assumption 1.5 is satisfied and let $k \in \mathbb{N}$. Then there exist polynomials
$\mathcal{Q}_k^{OU} : \mathbb{R} \to \mathbb{R}_+$ and $\mathcal{Q}_k : \mathbb{R}^2 \to \mathbb{R}_+$ of degrees $p_2 + 1$, $\max(p_4 + 3, p_3 + p_2 + 2, 3p_2 + 1)$
respectively, such that for any $x, \tilde{x} \in \mathbb{R}^d$ and $h \in (0,2)$,

(1.36)    $E\big[ \|\nabla_x G^{OU}(x, Y_h^{OU}(x))\|_+^k \big]^{1/k} \le \mathcal{Q}_k^{OU}(\|x\|_-) \cdot h^{1/2},$

(1.37)    $E\big[ \|\nabla_x G_h(x, Y_h(x))\|_+^k \big]^{1/k} \le \mathcal{Q}_k(\|x\|_-, \|\nabla U(x)\|_-) \cdot h^{3/2},$

(1.38)    $E\big[ |\alpha^{OU}(x, Y_h^{OU}(x)) - \alpha^{OU}(\tilde{x}, Y_h^{OU}(\tilde{x}))|^k \big]^{1/k}$
          $\qquad \le \mathcal{Q}_k^{OU}(\max(\|x\|_-, \|\tilde{x}\|_-)) \cdot \|x - \tilde{x}\|_- \cdot h^{1/2},$ and

(1.39)    $E\big[ |\alpha_h(x, Y_h(x)) - \alpha_h(\tilde{x}, Y_h(\tilde{x}))|^k \big]^{1/k}$
          $\qquad \le \mathcal{Q}_k\big( \max(\|x\|_-, \|\tilde{x}\|_-),\ \sup_{z \in [x, \tilde{x}]} \|\nabla U(z)\|_- \big) \cdot \|x - \tilde{x}\|_- \cdot h^{3/2},$

where $[x, \tilde{x}]$ denotes the line segment between $x$ and $\tilde{x}$.

The proof of the proposition is given in Section 5 below.

Remark 1.10. Again, the polynomials $\mathcal{Q}_k^{OU}$ and $\mathcal{Q}_k$ are explicit. They depend
only on the values $C_n, p_n$ in Assumption 1.5 for $n = 1, 2$, $n = 2, 3, 4$ respectively,
and on the moments $m_n = E[\|Z\|_-^n]$ for $n \le k \cdot (p_2 + 1)$,
$n \le k \cdot \max(p_4 + 3, p_3 + p_2 + 2, 2p_2 + 1)$ respectively, but they do not depend on the
dimension $d$. For semi-implicit Euler proposals, the upper bound in Proposition 1.9
is made explicit for the case $k = 1$ and $p_2 = p_3 = p_4 = 0$ in (5.19) below.

For Ornstein-Uhlenbeck proposals it will be useful to state the bounds in
Propositions 1.7 and 1.9 more explicitly for the case $p_2 = 0$, i.e., when the second
derivatives of $V$ are uniformly bounded w.r.t. the minus norm:

Proposition 1.11. Suppose that Assumption 1.5 is satisfied for $n = 1, 2$ with
$p_2 = 0$. Then for any $x, \tilde{x} \in \mathbb{R}^d$ and $h \in (0,2)$,

E[l -a^^(x,F;f^(x))] < mi(Ci + C2||x||_) -/i^/' 

+ ^(2m2C2 + Ci||x||_ + C2||xf_) . h + ^miC2||x||„ ■ h^^\
and

< (^y^C2-h^'^ + ^(Ci + 2C2max(||a;||_,P||_))-/i^ • \\x-x\\.. 

The proof is given in Sections 4 and 5 below. Again, corresponding bounds also
hold for norms for k ^ 1,2. 

1.4. Wasserstein contractivity. The bounds in Propositions 1.7, 1.9 and 1.11
can be applied to study contractivity properties of Metropolis-Hastings transition
kernels. Recall that the Kantorovich-Rubinstein or $L^1$-Wasserstein distance of two
probability measures $\mu$ and $\nu$ on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^d)$ w.r.t. a given metric
$d$ on $\mathbb{R}^d$ is defined by

    $\mathcal{W}(\mu, \nu) = \inf_{\eta \in \Pi(\mu, \nu)} \int d(x, \tilde{x})\, \eta(dx\, d\tilde{x})$

where $\Pi(\mu, \nu)$ consists of all couplings $\eta$ of $\mu$ and $\nu$, i.e., all probability measures
$\eta$ on $\mathbb{R}^d \times \mathbb{R}^d$ with marginals $\mu$ and $\nu$, cf. e.g. [31]. Recall that a coupling of $\mu$ and
$\nu$ can be realized by random variables $W$ and $\tilde{W}$ defined on a joint probability
space such that $W \sim \mu$ and $\tilde{W} \sim \nu$.

In order to derive upper bounds for the distances $\mathcal{W}(\mu q_h, \nu q_h)$, and, more
generally, $\mathcal{W}(\mu q_h^n, \nu q_h^n)$, $n \in \mathbb{N}$, we define a coupling of the MALA transition
probabilities $q_h(x, \cdot)$, $x \in \mathbb{R}^d$, by setting

    $W_h(x) := Y_h(x)$ if $U \le \alpha_h(x, Y_h(x))$,
    $W_h(x) := x$ if $U > \alpha_h(x, Y_h(x))$.

Here $Y_h(x)$, $x \in \mathbb{R}^d$, is the basic coupling of the proposal distributions $p_h(x, \cdot)$
defined by (1.27) with $Z \sim \gamma^d$, and the random variable $U$ is uniformly distributed
in $(0,1)$ and independent of $Z$.
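
The coupling just described uses the same Gaussian vector $Z$ and the same uniform variable $U$ for both starting points. A minimal Python sketch follows, assuming the expression (1.32) for $G_h$ and the proposal (1.27); the function names and the list-based vectors are hypothetical choices for this illustration.

```python
import math
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def g_h(x, y, h, V, grad_V):
    """G_h(x, y) as in (1.32)."""
    gx, gy = grad_V(x), grad_V(y)
    diff = [yi - xi for xi, yi in zip(x, y)]
    total = [yi + xi for xi, yi in zip(x, y)]
    g_sum = [a + b for a, b in zip(gy, gx)]
    g_diff = [a - b for a, b in zip(gy, gx)]
    correction = dot(total, g_diff) + dot(gy, gy) - dot(gx, gx)
    return V(y) - V(x) - 0.5 * dot(diff, g_sum) + h / (8.0 - 2.0 * h) * correction

def proposal(x, z, h, grad_V):
    """Y_h(x) as in (1.27), driven by the given noise vector z."""
    s = math.sqrt(h - h * h / 4.0)
    return [(1.0 - h / 2.0) * xi - 0.5 * h * gi + s * zi
            for xi, gi, zi in zip(x, grad_V(x), z)]

def coupled_step(x, x_tilde, h, V, grad_V, rng):
    """Return (W_h(x), W_h(x_tilde)) built from the same Z and the same U."""
    z = [rng.gauss(0.0, 1.0) for _ in x]     # shared Gaussian noise
    u = rng.random()                         # shared uniform variable
    results = []
    for state in (x, x_tilde):
        y = proposal(state, z, h, grad_V)
        alpha = math.exp(-max(g_h(state, y, h, V, grad_V), 0.0))
        results.append(y if u <= alpha else list(state))
    return results[0], results[1]
```

For $V \equiv 0$ every proposal is accepted, so the coupled states satisfy $W_h(x) - W_h(\tilde{x}) = (1 - h/2)(x - \tilde{x})$ exactly, which is the contraction of the proposal step alone.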

Correspondingly, we define a coupling of the Metropolis-Hastings transition kernels
$q_h^{OU}$ based on Ornstein-Uhlenbeck proposals by setting

    $W_h^{OU}(x) := Y_h^{OU}(x)$ if $U \le \alpha^{OU}(x, Y_h^{OU}(x))$,
    $W_h^{OU}(x) := x$ if $U > \alpha^{OU}(x, Y_h^{OU}(x))$.

Let

    $B_R^- := \{ x \in \mathbb{R}^d : \|x\|_- < R \}$

denote the centered ball of radius $R$ w.r.t. $\|\cdot\|_-$. As a consequence of Proposition
1.11 above, we obtain the following upper bound for the Kantorovich-Rubinstein-Wasserstein
distance of $q_h^{OU}(x, \cdot)$ and $q_h^{OU}(\tilde{x}, \cdot)$ w.r.t. the metric $d(x, \tilde{x}) = \|x - \tilde{x}\|_-$:

Theorem 1.12 (Contractivity of MH transitions based on OU proposals).
Suppose that Assumption 1.5 is satisfied for $n = 1, 2$ with $p_2 = 0$. Then for any
$h \in (0,2)$, $R \in (0, \infty)$, and $x, \tilde{x} \in B_R^-$,

    $E\big[ \|W_h^{OU}(x) - W_h^{OU}(\tilde{x})\|_- \big] \le c_h^{OU}(R) \cdot \|x - \tilde{x}\|_-,$ where

    $c_h^{OU}(R) = 1 - \tfrac{1}{2} h + m_2 C_2\, h + A (1 + R)(1 + h^{1/2} R)\, h^{3/2}$

with an explicit constant $A$ that only depends on the values $m_1, m_2, C_1$ and $C_2$.
The proof is given in Section 7 below.

Theorem 1.12 shows that Wasserstein contractivity holds on the ball $B_R^-$ provided
$2 m_2 C_2 < 1$ and $h$ is chosen sufficiently small depending on $R$ (with
$h^{1/2} = O(R^{-1})$). In this case, the contraction constant $c_h^{OU}(R)$ depends on the dimension
only through the values of the constants $C_1, C_2, m_1$ and $m_2$. On the other hand,
the following one-dimensional example shows that for $m_2 C_2 > 1$, the
acceptance-rejection step may destroy the contraction properties of the OU proposals:

Example 1.13. Suppose that $d = 1$ and $\|\cdot\|_- = |\cdot|$. If $V(x) = b x^2/2$ with a
constant $b \in (-1/2, 1/2)$ then by Theorem 1.12, Wasserstein contractivity holds for
the Metropolis-Hastings chain with Ornstein-Uhlenbeck proposals on the interval
$(-R, R)$ provided $h$ is chosen sufficiently small. On the other hand, if $V(x) = b x^2/2$
for $|x| < 1$ with a constant $b < -1$, then the logarithmic density

    $U(x) = V(x) + x^2/2 = (b + 1) \cdot x^2/2$

is strictly concave for $|x| < 1$, and it can be easily seen that Wasserstein
contractivity on $(-1, 1)$ does not hold for the MH chain with OU proposals if $h$ is
sufficiently small.

A disadvantage of the result for Ornstein-Uhlenbeck proposals stated above
is that not only a lower bound on the second derivative of $V$ is required (this
would be a fairly natural condition as the example indicates), but also an upper
bound of the same size. For semi-implicit Euler proposals, we can derive a better
result that requires only a strictly positive lower bound on the second derivative of
$U(x) = V(x) + |x|^2/2$ and Assumption 1.5 with arbitrary constants to be satisfied.
For this purpose we assume that

    $\|\cdot\|_- = \langle \cdot, \cdot \rangle^{1/2}$

for an inner product $\langle \cdot, \cdot \rangle$ on $\mathbb{R}^d$, and we make the following assumption on $U$:

Assumption 1.14. There exists a strictly positive constant $K \in (0, 1]$ such that

(1.40)    $\langle \eta, \nabla^2 U(x) \cdot \eta \rangle \ge K \langle \eta, \eta \rangle$ for any $x, \eta \in \mathbb{R}^d$.

Of course, Assumption 1.14 is still restrictive, and it will often be satisfied only
in a suitable ball around a local minimum of $U$. Most of the results below are
stated on a given ball $B_R^-$ w.r.t. the minus norm. In that case it is enough to
assume that Assumption 1.14 holds on that ball. If $\|\cdot\|_-$ coincides with the Euclidean norm
$|\cdot|$ then the assumption is equivalent to convexity of $U(x) - \tfrac{K}{2}|x|^2$. Moreover,
since $\nabla^2 U(x) = I_d + \nabla^2 V(x)$, a sufficient condition for (1.40) to hold is

(1.41)    $\|\nabla^2 V(x) \cdot \eta\|_- \le (1 - K)\, \|\eta\|_-$ for any $x, \eta \in \mathbb{R}^d$.

As a consequence of Propositions 1.7 and 1.9 above, we obtain the following
upper bound for the Kantorovich-Rubinstein-Wasserstein distance of $q_h(x, \cdot)$ and
$q_h(\tilde{x}, \cdot)$ w.r.t. the metric $d(x, \tilde{x}) = \|x - \tilde{x}\|_-$:

Theorem 1.15 (Contractivity of semi-implicit MALA transitions).
Suppose that Assumptions 1.5 and 1.14 are satisfied. Then for any $h \in (0,2)$,
$R \in (0, \infty)$, and $x, \tilde{x} \in B_R^-$,

    $E\big[ \|W_h(x) - W_h(\tilde{x})\|_- \big] \le c_h(R) \cdot \|x - \tilde{x}\|_-,$ where
Ch{R) = l-^Kh+ Qm(/?)2 + 7(i?)^ + (^K(3{R) + ^SiR) 



with 



M{R) = sup{|| V2?7(^) ■ : G Sf, z G B-}, 
(3{R) = snp{Vi{\\z\\^,\m{z)\\^) : zeB^}, 

7(i?) = ml/'.snp{Q2{\\z\\^,\m{z)\\^) : zeB],}, 
6iR) = snp{Q2i\\z\\^,miz)\n\\WUiz)\\^ : zeB^}. 

The proof is given in Section [7] below. 

Remark 1.16. Theorem 1.15 shows in particular that under Assumptions 1.5 and
1.14 there exist constants $C, q \in (0, \infty)$ such that the contraction

    $E\big[ \|W_h(x) - W_h(\tilde{x})\|_- \big] \le \big( 1 - \tfrac{K}{4} h \big)\, \|x - \tilde{x}\|_-$

holds for $x, \tilde{x} \in B_R^-$ whenever $h^{-1} \ge C \cdot (1 + R^q)$.

Example 1.17 (Transition Path Sampling). In the situation of Examples 1.1
and 1.6 above, Condition (1.41) and (hence) Assumption 1.14 are satisfied on a
ball $B_R^-$ with $K$ independent of $d$ provided $\|D^2\phi(x)\| \le 1 - K$ for any $x \in B_R^-$, cf.
(1.35). More generally, by modifying the metric in a suitable way if necessary, one
may expect Assumption 1.14 to hold uniformly in the dimension in neighbourhoods
of local minima of $U$.

1.5. Conclusions. For $R \in (0, \infty)$, we denote by $\mathcal{W}_R$ the
Kantorovich-Rubinstein-Wasserstein distance based on the distance function

(1.42)    $d_R(x, \tilde{x}) := \min(\|x - \tilde{x}\|_-, 2R).$

Note that $d_R$ is a bounded metric that coincides with the distance function induced
by the minus norm on $B_R^-$. The bounds resulting from Theorems 1.15 and 1.12 can
be iterated to obtain estimates for the KRW distance $\mathcal{W}_R$ between the distributions
of the corresponding Metropolis-Hastings chains after $n$ steps w.r.t. two different
initial distributions.

Corollary 1.18. Suppose that Assumptions 1.5 and 1.14 are satisfied, and let
$h \in (0,2)$ and $R \in (0, \infty)$. Then for any $n \in \mathbb{N}$, and for any probability measures
$\mu, \nu$ on $\mathcal{B}(\mathbb{R}^d)$,

    $\mathcal{W}_R(\mu q_h^n, \nu q_h^n) \le c_h(R)^n\, \mathcal{W}_R(\mu, \nu)$
    $\qquad + 2R \cdot \big( P_\mu[\exists\, k < n : X_k \notin B_R^-] + P_\nu[\exists\, k < n : X_k \notin B_R^-] \big).$

Here $c_h(R)$ is the constant in Theorem 1.15, and $(X_n, P_\mu)$ and $(X_n, P_\nu)$ are Markov
chains with transition kernel $q_h$ and initial distributions $\mu$, $\nu$ respectively. A
corresponding result with $c_h$ replaced by $c_h^{OU}$ holds for the Metropolis-Hastings chain
with Ornstein-Uhlenbeck proposals.

Since the joint law of $W_h(x)$ and $W_h(\tilde{x})$ is a coupling of $q_h(x, \cdot)$ and $q_h(\tilde{x}, \cdot)$ for
any $x, \tilde{x} \in \mathbb{R}^d$, Corollary 1.18 is a direct consequence of Theorems 1.15, 1.12
respectively, and Theorem 2.2 below. The corollary can be used to quantify the
Wasserstein distance between the distributions of the Metropolis-Hastings chain after $n$
steps w.r.t. two different initial distributions. For this purpose, one can estimate
the exit probabilities from the ball via an argument based on a Lyapunov function.
For semi-implicit Euler proposals we eventually obtain the following main
result:

Theorem 1.19 (Quantitative convergence bound for semi-implicit MALA).
Suppose that Assumptions 1.5 and 1.14 are satisfied. Then there exist constants
$C, D, q \in (0, \infty)$ such that the estimate

y\^2R{i^ql, T^ql) < \ ^--rh] W2ij(z/, tt) + D i? exp ( — ) nh 



holds for any $n \in \mathbb{N}$, $h, R \in (0, \infty)$ such that $h^{-1} \ge C \cdot (1 + R)^q$, and for any
probability measures $\nu, \pi$ on $\mathbb{R}^d$ with support in $B_R^-$. The constants $C$, $D$ and $q$ can
be made explicit. They depend only on the values of the constants in Assumptions
1.5 and 1.14 and on the moments $m_k$, $k \in \mathbb{N}$, w.r.t. the minus norm, but they do
not depend explicitly on the dimension.

The proof of Theorem 1.19 is given in Section 7 below.

Let $\mu_R(A) = \mu(A \mid B_R^-)$ denote the conditional probability measure given $B_R^-$.
Recalling that $\mu$ is a stationary distribution for the kernel $q_h$, we can apply
Theorem 1.19 to derive a bound for the Wasserstein distance between the distribution of
the MALA chain after $n$ steps and $\mu_R$:

Theorem 1.20. Suppose that Assumptions 1.5 and 1.14 are satisfied. Then there
exist constants $C, D, q \in (0, \infty)$ that do not depend explicitly on the dimension
such that the estimate

    $\mathcal{W}_{2R}(\nu q_h^n, \mu_R) \le 58 R\, \big( 1 - \tfrac{K}{4} h \big)^n + D R \exp\big( -\tfrac{K R^2}{33} \big)\, n h$

holds for any $n \in \mathbb{N}$, $h, R \in (0, \infty)$ such that $h^{-1} \ge C \cdot (1 + R)^q$, and for any
probability measure $\nu$ on $\mathbb{R}^d$ with support in $B_R^-$.

The proof is given in Section 7.

Given an error bound $\varepsilon\in(0,\infty)$ for the Kantorovich-Rubinstein-Wasserstein distance, we can now determine how many steps of the MALA chain are required such that

(1.43)   $\mathcal W_{2R}(\nu q_h^n,\mu_R)\le\varepsilon$ for any $\nu$ with support in $B_R^-$.

Assuming

(1.44)   $nh \ge \frac 4K \log\frac{116\,R}{\varepsilon}$,

we have $58R\,(1-\frac K4 h)^n\le\varepsilon/2$. Hence (1.43) holds provided the assumptions in Theorem 1.20 are satisfied, and

(1.45)   $DR\exp(-KR^2/33)\,nh \le \varepsilon/2$.

For a minimal choice of $n$, all conditions are satisfied if $R$ is of order $(\log\varepsilon^{-1})^{1/2}$ up to a log log correction, and the inverse step size $h^{-1}$ is of order $(\log\varepsilon^{-1})^{q/2}$ up to a log log correction. Hence if Assumption 1.14 holds on $\mathbb R^d$, then a number $n$ of steps that is polynomial in $\log\varepsilon^{-1}$ is sufficient to bound the error by $\varepsilon$ independently of the dimension.

On the other hand, if Assumption 1.14 is satisfied only on a ball $B_R^-$ of given radius $R$, then a given error bound $\varepsilon$ is guaranteed only if (1.45) holds with the minimal choice for $nh$ satisfying (1.44), i.e., if

(1.46)   $8DK^{-1}\log(116\,R\,\varepsilon^{-1})\,R\exp(-KR^2/33)\le\varepsilon$.

If $\varepsilon$ is chosen smaller, then the chain may leave the ball $B_R^-$ before sufficient mixing on $B_R^-$ has taken place.
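
The bookkeeping in (1.44)-(1.46) can be illustrated numerically. The following Python sketch is only an illustration: the values of $K$, $D$, $C$ and $q$ are hypothetical placeholders (the theorems only assert that such constants exist), and the helper names are not taken from the paper. It scans for a radius $R$ satisfying (1.45) with the minimal $nh$ from (1.44), then picks the largest step size allowed by $h^{-1}\ge C(1+R)^q$.

```python
import math

def min_time_horizon(K, R, eps):
    """Minimal n*h permitted by (1.44): n*h >= (4/K) * log(116*R/eps)."""
    return 4.0 / K * math.log(116.0 * R / eps)

def tail_term(D, K, R, nh):
    """Left-hand side of (1.45): D * R * exp(-K*R^2/33) * n*h."""
    return D * R * math.exp(-K * R * R / 33.0) * nh

def smallest_radius(D, K, eps, R0=1.0, step=0.1):
    """Scan R upward until (1.45) holds with the minimal n*h from (1.44)."""
    R = R0
    while tail_term(D, K, R, min_time_horizon(K, R, eps)) > eps / 2.0:
        R += step
    return R

# Hypothetical constants, chosen for illustration only.
K, D, C, q, eps = 1.0, 1.0, 1.0, 2.0, 1e-3
R = smallest_radius(D, K, eps)
nh = min_time_horizon(K, R, eps)
h = 1.0 / (C * (1.0 + R) ** q)   # largest h with 1/h >= C * (1 + R)^q
n = math.ceil(nh / h)            # number of steps
print(R, h, n)
```

With these placeholder constants the step count grows only polylogarithmically in $\varepsilon^{-1}$, in line with the discussion above; the actual constants depend on Assumptions 1.5 and 1.14.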

2. Wasserstein contractivity of Metropolis-Hastings kernels 

In this section, we first consider an arbitrary stochastic kernel $q: S\times\mathcal B(S)\to[0,1]$ on a metric space $(S,d)$. Further below, we will choose $S=\mathbb R^d$ and $d(x,y)=\|x-y\|_-\wedge R$ for some constant $R\in(0,\infty]$, and we will assume that $q$ is the transition kernel of a Metropolis-Hastings chain.

The Kantorovich-Rubinstein or $L^1$-Wasserstein distance of two probability measures $\mu$ and $\nu$ on the Borel $\sigma$-algebra $\mathcal B(S)$ w.r.t. the metric $d$ is defined by

$\mathcal W_d(\mu,\nu) = \inf_\eta \int d(x,\tilde x)\,\eta(dx\,d\tilde x)$

where the infimum is over all couplings $\eta$ of $\mu$ and $\nu$, i.e., over all probability measures $\eta$ on $S\times S$ with marginals $\mu$ and $\nu$, cf. e.g. [Villani]. In order to derive upper bounds for the Kantorovich distances $\mathcal W_d(\mu q,\nu q)$, and more generally, $\mathcal W_d(\mu q^n,\nu q^n)$, $n\in\mathbb N$, we construct couplings between the measures $q(x,\cdot)$ for $x\in S$, and we derive bounds for the distances $\mathcal W_d(q(x,\cdot),q(\tilde x,\cdot))$, $x,\tilde x\in S$.

Definition 2.1. A Markovian coupling of the probability measures $q(x,\cdot)$, $x\in S$, is a stochastic kernel $c$ on the product space $(S\times S,\mathcal B(S\times S))$ such that for any $x,\tilde x\in S$, the distributions of the first and second component under $c((x,\tilde x),dy\,d\tilde y)$ are $q(x,dy)$ and $q(\tilde x,d\tilde y)$ respectively.

Example. 1) Suppose that $(\Omega,\mathcal A,P)$ is a probability space and $(x,\tilde x,\omega)\mapsto Y(x,\tilde x)(\omega)$, $(x,\tilde x,\omega)\mapsto\tilde Y(x,\tilde x)(\omega)$ are product measurable functions from $S\times S\times\Omega$ to $S$ such that $Y(x,\tilde x)\sim q(x,\cdot)$ and $\tilde Y(x,\tilde x)\sim q(\tilde x,\cdot)$ w.r.t. $P$ for any $x,\tilde x\in S$. Then the joint distributions

$c((x,\tilde x),\cdot) = P\circ(Y(x,\tilde x),\tilde Y(x,\tilde x))^{-1}$,  $x,\tilde x\in S$,

define a Markovian coupling of the measures $q(x,\cdot)$, $x\in S$.

2) In particular, if $(x,\omega)\mapsto Y(x)(\omega)$ is a product measurable function from $S\times\Omega$ to $S$ such that $Y(x)\sim q(x,\cdot)$ for any $x\in S$, then

$c((x,\tilde x),\cdot) = P\circ(Y(x),Y(\tilde x))^{-1}$

is a Markovian coupling of the measures $q(x,\cdot)$, $x\in S$.

Suppose that $(X_n,\tilde X_n)$ on $(\Omega,\mathcal A,P)$ is a Markov chain with values in $S\times S$ and transition kernel $c$, where $c$ is a Markovian coupling w.r.t. the kernel $q$. Then the components $(X_n)$ and $(\tilde X_n)$ are Markov chains with transition kernel $q$ and initial distributions given by the marginals of the initial distribution of $(X_n,\tilde X_n)$, i.e., $(X_n,\tilde X_n)$ is a coupling of these Markov chains. We will apply the following general theorem to quantify the deviation from equilibrium after $n$ steps of the Markov chain with transition kernel $q$:

Theorem 2.2. Let $\gamma\in(0,1)$ and let $c((x,\tilde x),dy\,d\tilde y)$ be a Markovian coupling of the probability measures $q(x,\cdot)$, $x\in S$. Suppose that $O$ is an open subset of $S$, and assume that the metric $d$ is bounded. Let

$\Delta := \operatorname{diam} S = \sup\{d(x,\tilde x) : x,\tilde x\in S\}$.

If the contractivity condition

(2.1)   $\int d(y,\tilde y)\,c((x,\tilde x),dy\,d\tilde y) \le \gamma\cdot d(x,\tilde x)$

holds for any $x,\tilde x\in O$, then

(2.2)   $\mathcal W_d(\mu q^n,\nu q^n) \le \gamma^n\,\mathcal W_d(\mu,\nu) + \Delta\cdot\left(P_\mu[\exists\,k<n: X_k\notin O] + P_\nu[\exists\,k<n: X_k\notin O]\right)$

for any $n\in\mathbb N$ and for any probability measures $\mu,\nu$ on $\mathcal B(S)$. Here $(X_n,P_\mu)$ and $(X_n,P_\nu)$ are Markov chains with transition kernel $q$ and initial distributions $\mu$, $\nu$ respectively.

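Proof of Theorem 2.2. Suppose that $\mu$ and $\nu$ are probability measures on $\mathcal B(S)$ and $\eta(dx\,d\tilde x)$ is a coupling of $\mu$ and $\nu$. We consider the coupling chain $(X_n,\tilde X_n)$ on $(\Omega,\mathcal A,P)$ with initial distribution $\eta$ and transition kernel $c$. Since $(X_n)$ and $(\tilde X_n)$ are Markov chains with transition kernel $q$ and initial distributions $\mu$ and $\nu$, we have $P\circ X_n^{-1}=\mu q^n$ and $P\circ\tilde X_n^{-1}=\nu q^n$ for any $n\in\mathbb N$. Moreover, by (2.1),

$E[d(X_n,\tilde X_n)\,;\,(X_k,\tilde X_k)\in O\times O\ \forall\,k<n]$
$\quad = E\left[\int d(x_n,\tilde x_n)\,c((X_{n-1},\tilde X_{n-1}),dx_n\,d\tilde x_n)\,;\,(X_k,\tilde X_k)\in O\times O\ \forall\,k<n\right]$
$\quad \le \gamma\,E[d(X_{n-1},\tilde X_{n-1})\,;\,(X_k,\tilde X_k)\in O\times O\ \forall\,k<n-1]$.

Therefore, by induction,

$E[d(X_n,\tilde X_n)] = E[d(X_n,\tilde X_n)\,;\,(X_k,\tilde X_k)\in O\times O\ \forall\,k<n] + E[d(X_n,\tilde X_n)\,;\,\exists\,k<n: (X_k,\tilde X_k)\notin O\times O]$
$\quad \le \gamma^n\int d(x,\tilde x)\,\eta(dx\,d\tilde x) + \Delta\cdot P[\exists\,k<n: (X_k,\tilde X_k)\notin O\times O]$.

Since the exit probability on the right-hand side is bounded by $P_\mu[\exists\,k<n: X_k\notin O]+P_\nu[\exists\,k<n: X_k\notin O]$, taking the infimum over all couplings $\eta$ of $\mu$ and $\nu$ implies (2.2). □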



Remark 2.3. Theorem 2.2 may also be useful for studying local equilibration of a Markov chain within a metastable state. In fact, if $O$ is a region of the state space where the process stays with high probability for a long time, and if a contractivity condition holds on $O$, then the result can be used to bound the Kantorovich-Rubinstein-Wasserstein distance between the distribution after a finite number of steps and the stationary distribution conditioned to $O$.

From now on, we assume that we are given a Markovian coupling of the proposal distributions $p(x,\cdot)$, $x\in\mathbb R^d$, of a Metropolis-Hastings algorithm which is realized by product measurable functions $(x,\tilde x,\omega)\mapsto Y(x,\tilde x)(\omega),\tilde Y(x,\tilde x)(\omega)$ on a probability space $(\Omega,\mathcal A,P)$ such that

$Y(x,\tilde x)\sim p(x,\cdot)$ and $\tilde Y(x,\tilde x)\sim p(\tilde x,\cdot)$ for any $x,\tilde x\in\mathbb R^d$.

Let $\alpha(x,y)$ and $q(x,dy)$ again denote the acceptance probabilities and the transition kernel of the Metropolis-Hastings chain with stationary distribution $\mu$, cf. (1.10), (1.11) and (1.12). Moreover, suppose that $U$ is a uniformly distributed random variable with values in $(0,1)$ that is independent of $\{Y(x,\tilde x),\tilde Y(x,\tilde x) : x,\tilde x\in\mathbb R^d\}$. Then the functions $(x,\tilde x,\omega)\mapsto W(x,\tilde x)(\omega),\tilde W(x,\tilde x)(\omega)$ defined by

$W(x,\tilde x) := Y(x,\tilde x)$ if $U\le\alpha(x,Y(x,\tilde x))$,   $W(x,\tilde x) := x$ if $U>\alpha(x,Y(x,\tilde x))$,
$\tilde W(x,\tilde x) := \tilde Y(x,\tilde x)$ if $U\le\alpha(\tilde x,\tilde Y(x,\tilde x))$,   $\tilde W(x,\tilde x) := \tilde x$ if $U>\alpha(\tilde x,\tilde Y(x,\tilde x))$,

realize a Markovian coupling between the Metropolis-Hastings transition functions $q(x,\cdot)$, $x\in\mathbb R^d$, i.e.,

$W(x,\tilde x)\sim q(x,\cdot)$ and $\tilde W(x,\tilde x)\sim q(\tilde x,\cdot)$

for any $x,\tilde x\in\mathbb R^d$. This coupling is optimal in the acceptance step in the sense that it minimizes the probability that a proposed move from $x$ to $Y(x,\tilde x)$ is accepted and the corresponding proposed move from $\tilde x$ to $\tilde Y(x,\tilde x)$ is rejected, or vice versa.
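
The coupling of the acceptance steps through a common uniform random variable can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the one-dimensional potential `V`, the step size `h`, the synchronous Ornstein-Uhlenbeck proposal in `ou_pair`, and the acceptance function `alpha_ou` (taken here as $\exp(-(V(y)-V(x))^+)$) are hypothetical choices made only for the example.

```python
import math
import random

def coupled_mh_step(x, xt, propose_pair, alpha, rng):
    """One step of the coupled chain: both components share the same uniform U."""
    y, yt = propose_pair(x, xt, rng)        # coupled proposals (Y, Y~)
    u = rng.random()                        # common uniform variable U
    w = y if u <= alpha(x, y) else x        # accept or reject the first component
    wt = yt if u <= alpha(xt, yt) else xt   # accept or reject the second component
    return w, wt

# --- hypothetical one-dimensional example ---
h = 0.2

def V(x):
    return x ** 4 / 4.0

def ou_pair(x, xt, rng):
    """Synchronous coupling: both proposals are driven by the same Gaussian z."""
    z = rng.gauss(0.0, 1.0)
    s = math.sqrt(h - h * h / 4.0)
    return (1.0 - h / 2.0) * x + s * z, (1.0 - h / 2.0) * xt + s * z

def alpha_ou(x, y):
    """Assumed acceptance probability exp(-(V(y) - V(x))^+)."""
    return math.exp(-max(0.0, V(y) - V(x)))

rng = random.Random(1)
x, xt = 1.0, -0.5
for _ in range(200):
    x, xt = coupled_mh_step(x, xt, ou_pair, alpha_ou, rng)
print(abs(x - xt))
```

A useful sanity check of the construction is that two chains started at the same point never separate, because proposals and acceptance decisions then coincide.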
Lemma 2.4 (Basic contractivity lemma for MH kernels). For any $x,\tilde x\in\mathbb R^d$, writing $W=W(x,\tilde x)$, $\tilde W=\tilde W(x,\tilde x)$, $Y=Y(x,\tilde x)$ and $\tilde Y=\tilde Y(x,\tilde x)$,

$E[d(W,\tilde W)] \le E[d(Y,\tilde Y)]$
$\quad + E[(d(x,\tilde x)-d(Y,\tilde Y))\cdot\max(1-\alpha(x,Y),\,1-\alpha(\tilde x,\tilde Y))]$
$\quad + E[d(x,Y)\cdot(\alpha(x,Y)-\alpha(\tilde x,\tilde Y))^+]$
$\quad + E[d(\tilde x,\tilde Y)\cdot(\alpha(x,Y)-\alpha(\tilde x,\tilde Y))^-]$.

Proof. By definition of $W$ and $\tilde W$, we obtain

$E[d(W,\tilde W)] = E[d(Y,\tilde Y)\,;\,U\le\min(\alpha(x,Y),\alpha(\tilde x,\tilde Y))]$
$\quad + d(x,\tilde x)\cdot P[U>\max(\alpha(x,Y),\alpha(\tilde x,\tilde Y))]$
$\quad + E[d(x,\tilde Y)\,;\,\alpha(x,Y)<U\le\alpha(\tilde x,\tilde Y)]$
$\quad + E[d(\tilde x,Y)\,;\,\alpha(\tilde x,\tilde Y)<U\le\alpha(x,Y)]$.

The assertion now follows by applying the triangle inequality to the last two terms and conditioning on $Y$ and $\tilde Y$. □

Remark 2.5. (1) Note that the upper bound in Lemma 2.4 is close to an equality. Indeed, the only estimate in the proof is the triangle inequality that has been applied to bound $d(x,\tilde Y)$ by $d(x,\tilde x)+d(\tilde x,\tilde Y)$ and $d(\tilde x,Y)$ by $d(x,\tilde x)+d(x,Y)$.

(2) For the couplings and distances considered in this paper, $d(Y,\tilde Y)$ will always be deterministic. Therefore, the upper bound in the lemma simplifies to

(2.3)   $E[d(W,\tilde W)] \le d(Y,\tilde Y) + (d(x,\tilde x)-d(Y,\tilde Y))\cdot E[\max(1-\alpha(x,Y),1-\alpha(\tilde x,\tilde Y))]$
$\qquad\qquad + E[d(x,Y)(\alpha(x,Y)-\alpha(\tilde x,\tilde Y))^+ + d(\tilde x,\tilde Y)(\alpha(x,Y)-\alpha(\tilde x,\tilde Y))^-]$.

Here $E[\max(1-\alpha(x,Y),1-\alpha(\tilde x,\tilde Y))]$ is the probability that at least one of the proposals is rejected.

(3) If the metric $d$ is bounded with diameter $\Delta$, then the last two expectations in the upper bound in Lemma 2.4 can be estimated by $\Delta$ times the probability $E[|\alpha(x,Y)-\alpha(\tilde x,\tilde Y)|]$ that one of the proposals is rejected and the other one is accepted. Alternatively (and usually more efficiently), these terms can be estimated by Hölder's inequality.

3. Contractivity of the proposal step 

In this section we assume $V\in C^2(\mathbb R^d)$. We study contractivity properties of the Metropolis-Hastings proposals defined in (1.23) and (1.27).

Note first that the Ornstein-Uhlenbeck proposals do not depend on $V$. For $h\in(0,2)$, the contractivity condition

(3.1)   $\|Y_h^{OU}(x)-Y_h^{OU}(\tilde x)\| = \|(1-h/2)(x-\tilde x)\| = (1-h/2)\,\|x-\tilde x\|$

holds pointwise for any $x,\tilde x\in\mathbb R^d$ w.r.t. an arbitrary norm $\|\cdot\|$ on $\mathbb R^d$.
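
The pointwise identity (3.1) is easy to observe numerically. The sketch below assumes, as an illustration, that the Ornstein-Uhlenbeck proposal has the form $Y_h^{OU}(x)=(1-h/2)x+\sqrt{h-h^2/4}\,Z$ and that both proposals are driven by the same noise vector; the Euclidean norm stands in for the arbitrary norm, and the function names are hypothetical.

```python
import math
import random

def ou_proposal(x, z, h):
    """Assumed form Y_h^OU(x) = (1 - h/2) x + sqrt(h - h^2/4) z, componentwise."""
    s = math.sqrt(h - h * h / 4.0)
    return [(1.0 - h / 2.0) * xi + s * zi for xi, zi in zip(x, z)]

def euclid(u, v):
    """Euclidean distance, used here as one admissible choice of norm."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
h, d = 0.5, 5
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
xt = [rng.gauss(0.0, 1.0) for _ in range(d)]
z = [rng.gauss(0.0, 1.0) for _ in range(d)]   # the same noise drives both proposals

ratio = euclid(ou_proposal(x, z, h), ou_proposal(xt, z, h)) / euclid(x, xt)
print(ratio, 1.0 - h / 2.0)
```

Because the noise cancels in the difference, the ratio equals $1-h/2$ up to rounding, independently of the sampled points.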
For the semi-implicit Euler proposals

$Y_h(x) = x - \frac h2\nabla U(x) + \sqrt{h-\frac{h^2}{4}}\;Z$,

Wasserstein contractivity does not necessarily hold. Close to optimal sufficient conditions for contractivity w.r.t. the minus norm can be obtained in a straightforward way by considering the derivative of $Y_h$ w.r.t. $x$.

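Lemma 3.1. Let $h\in(0,2)$ and let $C$ be a convex subset of $\mathbb R^d$. If there exists a constant $\lambda\in(0,\infty)$ such that

(3.2)   $\left\|\eta-\frac h2\nabla^2U(x)\cdot\eta\right\|_- \le \lambda\,\|\eta\|_-$ for any $\eta\in\mathbb R^d$, $x\in C$,

then

$\|Y_h(x)-Y_h(\tilde x)\|_- \le \lambda\,\|x-\tilde x\|_-$ for any $x,\tilde x\in C$.

Proof. If (3.2) holds, then

$\|\partial_\eta Y_h(z)\|_- = \left\|\eta-\frac h2\nabla^2U(z)\cdot\eta\right\|_- \le \lambda\,\|\eta\|_-$

for any $z\in C$ and $\eta\in\mathbb R^d$. Hence

$\|Y_h(x)-Y_h(\tilde x)\|_- = \left\|\int_0^1\frac{d}{dt}Y_h(tx+(1-t)\tilde x)\,dt\right\|_- \le \int_0^1\|\partial_{x-\tilde x}Y_h(tx+(1-t)\tilde x)\|_-\,dt \le \lambda\,\|x-\tilde x\|_-$

for $x,\tilde x\in C$. □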



Remark 3.2. (1) Note that condition (3.2) requires a bound on $\nabla^2U$ in both directions. This is in contrast to the continuous time case where a lower bound by a strictly positive constant is sufficient to guarantee contractivity of the derivative flow.

(2) Condition (3.2) is equivalent to

(3.3)   $\xi\cdot\eta-\frac h2\,\partial^2_{\xi\eta}U(x) \le \lambda\,\|\xi\|_+\|\eta\|_-$ for any $x\in C$, $\xi,\eta\in\mathbb R^d$.

Recall that for $R\in(0,\infty]$,

(3.4)   $M(R) = \sup\{\|\nabla^2U(x)\cdot\eta\|_- : \eta\in B_1^-,\ x\in B_R^-\}$.

Hence $M(R)$ bounds the second derivative of $U$ on $B_R^-$ in both directions, whereas the constant $K$ in Assumption 1.14 is a strictly positive lower bound for the second derivative. We also define

(3.5)   $N(R) = \sup\{\|\nabla^2V(x)\cdot\eta\|_- : \eta\in B_1^-,\ x\in B_R^-\}$.

Note that $M(R)\le1+N(R)$. As a consequence of Lemma 3.1 we obtain:
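Proposition 3.3. For any $h\in(0,2)$ and $x,\tilde x\in B_R^-$,

(3.6)   $\|Y_h(x)-Y_h(\tilde x)\|_- \le \left(1-\frac{1-N(R)}{2}\,h\right)\|x-\tilde x\|_-$.

Moreover, if Assumption 1.14 holds then

(3.7)   $\|Y_h(x)-Y_h(\tilde x)\|_- \le \left(1-\frac K2 h+\frac{M(R)^2}{8}h^2\right)\|x-\tilde x\|_-$.

Proof. Note that for $z\in[x,\tilde x]$ and $\eta\in\mathbb R^d$,

(3.8)   $\left(I-\frac h2\nabla^2U(z)\right)\cdot\eta = \left(1-\frac h2\right)\eta-\frac h2\nabla^2V(z)\cdot\eta$.

Therefore, by (3.5),

$\left\|\left(I-\frac h2\nabla^2U(z)\right)\cdot\eta\right\|_- \le \left(1-\frac h2\right)\|\eta\|_-+\frac h2N(R)\,\|\eta\|_-$.

The inequality (3.6) now follows by Lemma 3.1.

Moreover, if Assumption 1.14 holds then for $z\in[x,\tilde x]$ and $\eta\in\mathbb R^d$,

$\left\|\left(I-\frac h2\nabla^2U(z)\right)\cdot\eta\right\|_-^2 = \|\eta\|_-^2-h\,(\eta,\nabla^2U(z)\cdot\eta)_-+\frac{h^2}{4}\|\nabla^2U(z)\cdot\eta\|_-^2$
$\quad \le \left(1-Kh+M(R)^2h^2/4\right)\|\eta\|_-^2 \le \left(1-Kh/2+M(R)^2h^2/8\right)^2\|\eta\|_-^2$.

The inequality (3.7) again follows by Lemma 3.1. □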
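
As a sanity check of the shape of the bound (3.7), one can look at the simplest case of a quadratic potential with diagonal Hessian. The Python sketch below is an illustration under simplifying assumptions that are not part of the paper: the minus norm is replaced by the Euclidean norm, and $U(x)=\frac12\sum_i\lambda_ix_i^2$ with $K\le\lambda_i\le M$, so that the Lipschitz constant of the deterministic part $x\mapsto x-\frac h2\nabla U(x)$ is $\max_i|1-h\lambda_i/2|$.

```python
def contraction_factor(lam, h):
    """|1 - h*lam/2|: Lipschitz constant of x -> x - (h/2)*lam*x in one coordinate."""
    return abs(1.0 - h * lam / 2.0)

def bound_37(K, M, h):
    """Right-hand factor of (3.7): 1 - K*h/2 + M^2*h^2/8."""
    return 1.0 - K * h / 2.0 + (M * h) ** 2 / 8.0

# Hypothetical curvature bounds K <= lambda_i <= M.
K, M = 0.5, 6.0
worst = max(
    contraction_factor(K + (M - K) * i / 50.0, 0.02 * j) - bound_37(K, M, 0.02 * j)
    for i in range(51)        # grid of eigenvalues in [K, M]
    for j in range(1, 100)    # grid of step sizes in (0, 2)
)
print(worst)  # non-positive if the bound holds on the whole grid
```

The check also shows why an upper bound on the Hessian is needed: for large $h\lambda_i$ the factor $|1-h\lambda_i/2|$ exceeds $1-Kh/2$, and only the quadratic correction $M^2h^2/8$ restores the inequality.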

4. Upper bounds for rejection probabilities

In this section we derive the upper bounds for the MH rejection probabilities stated in Proposition 1.7. As a first step we prove the explicit formula for the MALA acceptance probabilities w.r.t. explicit and semi-implicit Euler proposals stated in Proposition 1.3.

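Proof of Proposition 1.3. For explicit Euler proposals with given step size $h>0$,

$-\log\gamma^d(x)\,p_h^{Euler}(x,y) = \frac12|x|^2+\frac{1}{2h}\left|y-\left(1-\frac h2\right)x+\frac h2\nabla V(x)\right|^2+C$
$\quad = \frac{1}{2h}\left(|y-x|^2+h\,x\cdot y+h\,(y-x)\cdot\nabla V(x)+\frac{h^2}{4}|x|^2+\frac{h^2}{2}\,x\cdot\nabla V(x)+\frac{h^2}{4}|\nabla V(x)|^2\right)+C$
$\quad = S(x,y)+\frac12(y-x)\cdot\nabla V(x)+\frac h8\,|x+\nabla V(x)|^2$

with a normalization constant $C$ that does not depend on $x$ and $y$, and a symmetric function $S:\mathbb R^d\times\mathbb R^d\to\mathbb R$. Therefore, by (1.19),

$G_h^{Euler}(x,y) = V(y)-V(x)+\log\gamma^d(x)\,p_h^{Euler}(x,y)-\log\gamma^d(y)\,p_h^{Euler}(y,x)$
$\quad = V(y)-V(x)-(y-x)\cdot(\nabla V(y)+\nabla V(x))/2+h\left(|y+\nabla V(y)|^2-|x+\nabla V(x)|^2\right)/8$.

Similarly, for semi-implicit Euler proposals we obtain

$-\log\gamma^d(x)\,p_h(x,y) = \frac12|x|^2+\frac12\left(h-\frac{h^2}{4}\right)^{-1}\left|y-\left(1-\frac h2\right)x+\frac h2\nabla V(x)\right|^2+C$
$\quad = S(x,y)+\frac12(y-x)\cdot\nabla V(x)+\frac{h}{8-2h}\left[(y+x)\cdot\nabla V(x)+|\nabla V(x)|^2\right]$,

where $S$ is again symmetric because $(1-h/2)^2=1-(h-h^2/4)$, and, therefore,

$G_h(x,y) = V(y)-V(x)-(y-x)\cdot(\nabla V(y)+\nabla V(x))/2$
$\qquad + \frac{h}{8-2h}\left[(y+x)\cdot(\nabla V(y)-\nabla V(x))+|\nabla V(y)|^2-|\nabla V(x)|^2\right]$. □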
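
The closed-form expression for $G_h$ in the semi-implicit case can be cross-checked against the defining log-density ratio. The following Python sketch does this for a hypothetical test potential $V(x)=\frac14\sum_ix_i^4$ (any smooth $V$ would do); the function names are illustrative and not taken from the paper, and the proposal density is assumed to be Gaussian with mean $(1-h/2)x-\frac h2\nabla V(x)$ and covariance $(h-h^2/4)I_d$.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def V(x):
    """Hypothetical test potential V(x) = sum(x_i^4) / 4."""
    return sum(t ** 4 for t in x) / 4.0

def grad_V(x):
    return [t ** 3 for t in x]

def log_gamma_p(x, y, h):
    """log(gamma^d(x) * p_h(x, y)) up to a constant independent of x and y."""
    g = grad_V(x)
    r = [yi - (1 - h / 2) * xi + (h / 2) * gi for xi, yi, gi in zip(x, y, g)]
    return -dot(x, x) / 2 - dot(r, r) / (2 * (h - h * h / 4))

def G_direct(x, y, h):
    """G_h from the definition: V(y) - V(x) + log-density ratio of the proposals."""
    return V(y) - V(x) + log_gamma_p(x, y, h) - log_gamma_p(y, x, h)

def G_closed(x, y, h):
    """G_h from the closed-form expression derived in the proof above."""
    gx, gy = grad_V(x), grad_V(y)
    diff = [b - a for a, b in zip(x, y)]
    summ = [a + b for a, b in zip(x, y)]
    lead = V(y) - V(x) - dot(diff, [a + b for a, b in zip(gx, gy)]) / 2
    corr = dot(summ, [b - a for a, b in zip(gx, gy)]) + dot(gy, gy) - dot(gx, gx)
    return lead + h / (8 - 2 * h) * corr

rng = random.Random(3)
x = [rng.uniform(-1, 1) for _ in range(3)]
y = [rng.uniform(-1, 1) for _ in range(3)]
print(G_direct(x, y, 0.4), G_closed(x, y, 0.4))
```

The two values agree up to rounding because the symmetric part $S(x,y)$ and the normalization constant cancel in the ratio.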



From now on we assume that Assumption 1.5 holds. We will derive upper bounds for the functions $G_h(x,y)$ computed in Proposition 1.3. By (1.16), these directly imply corresponding upper bounds for the MALA rejection probabilities. For $x,\tilde x\in\mathbb R^d$ and $n=1,2,3,4$ let

(4.1)   $L_n(x,\tilde x) = \sup\{(\partial^n_{\xi_1,\dots,\xi_n}V)(z) : z\in[x,\tilde x],\ \xi_1,\dots,\xi_n\in B_1^-\}$.

In other words,

$L_n(x,\tilde x) = \sup_{z\in[x,\tilde x]}\|(\partial^nV)(z)\|_-^*$

where $\|\cdot\|_-^*$ is the dual norm on $n$-forms defined by

$\|l\|_-^* = \sup\{l(\xi_1,\dots,\xi_n) : \xi_1,\dots,\xi_n\in B_1^-\}$.

In particular,

$L_1(x,\tilde x) = \sup_{z\in[x,\tilde x]}\|\nabla V(z)\|_+$.

By Assumption 1.5,

(4.2)   $L_n(x,\tilde x) \le C_n\cdot\max(1,\|x\|_-,\|\tilde x\|_-)^{p_n}$ for all $x,\tilde x\in\mathbb R^d$, $n=1,2,3,4$.

We now derive upper bounds for the terms in the expression for $G_h(x,y)$ given in Proposition 1.3. We first express the leading order term in terms of third derivatives of $V$:

Lemma 4.1. For $x,y\in\mathbb R^d$,

$V(y)-V(x)-\frac{y-x}{2}\cdot(\nabla V(y)+\nabla V(x)) = -\frac12\int_0^1t(1-t)\,\partial^3_{y-x}V((1-t)x+ty)\,dt$.

Proof. A second order expansion for $f(t)=V(x+t(y-x))$, $t\in[0,1]$, yields

$V(y)-V(x) = \int_0^1\partial_{y-x}V(x+t(y-x))\,dt$
$\quad = (y-x)\cdot\nabla V(x)+\int_0^1\int_0^t\partial^2_{y-x}V(x+s(y-x))\,ds\,dt$
$\quad = (y-x)\cdot\nabla V(x)+\int_0^1(1-s)\,\partial^2_{y-x}V(x+s(y-x))\,ds$,

and, similarly,

$V(y)-V(x) = (y-x)\cdot\nabla V(y)-\int_0^1\int_t^1\partial^2_{y-x}V(x+s(y-x))\,ds\,dt$
$\quad = (y-x)\cdot\nabla V(y)-\int_0^1s\,\partial^2_{y-x}V(x+s(y-x))\,ds$.

By averaging both equations, we obtain

$V(y)-V(x)-\frac{y-x}{2}\cdot(\nabla V(y)+\nabla V(x)) = \frac12\int_0^1(1-2s)\,\partial^2_{y-x}V(x+s(y-x))\,ds$
$\quad = -\frac12\int_0^1t(1-t)\,\partial^3_{y-x}V(x+t(y-x))\,dt$.

Here we have used that for any function $g\in C^1([0,1])$,

$\int_0^1(1-2s)\,g(s)\,ds = \int_0^1(1-2s)\int_0^sg'(t)\,dt\,ds = \int_0^1\int_t^1(1-2s)\,ds\;g'(t)\,dt = -\int_0^1t(1-t)\,g'(t)\,dt$. □
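
The identity of Lemma 4.1 can be verified numerically in one dimension, where $\partial^3_{y-x}V(z)=V'''(z)(y-x)^3$. The sketch below uses the hypothetical test function $V(x)=x^4$ and a composite Simpson rule; both choices are made only for this illustration.

```python
def V(x):
    return x ** 4

def dV(x):
    return 4 * x ** 3

def d3V(x):
    return 24 * x

def lhs(x, y):
    """V(y) - V(x) - (y - x)/2 * (V'(y) + V'(x))."""
    return V(y) - V(x) - (y - x) / 2 * (dV(y) + dV(x))

def rhs(x, y, n=200):
    """-(1/2) * int_0^1 t(1-t) V'''(x + t(y-x)) (y-x)^3 dt by composite Simpson (n even)."""
    def f(t):
        return t * (1 - t) * d3V(x + t * (y - x)) * (y - x) ** 3
    s = f(0.0) + f(1.0) + sum((4 if k % 2 else 2) * f(k / n) for k in range(1, n))
    return -0.5 * s / (3 * n)

print(lhs(0.3, -1.7), rhs(0.3, -1.7))
```

For this polynomial $V$ the integrand is a cubic in $t$, so Simpson's rule is exact up to rounding and both sides agree to machine precision.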

Lemma 4.2. For $x,y\in\mathbb R^d$, the following estimates hold:

(1) $|V(y)-V(x)| \le L_1(x,y)\cdot\|y-x\|_-$,

(2) $|V(y)-V(x)-\frac{y-x}{2}\cdot(\nabla V(y)+\nabla V(x))| \le \frac{1}{12}L_3(x,y)\cdot\|y-x\|_-^3$,

(3) $|(\nabla U(y)+\nabla U(x))\cdot(\nabla V(y)-\nabla V(x))| \le L_2(x,y)\cdot\|\nabla U(y)+\nabla U(x)\|_-\cdot\|y-x\|_-$,

(4) $\|\nabla U(y)+\nabla U(x)\|_- \le 2\|\nabla U(x)\|_-+(1+L_2(x,y))\cdot\|y-x\|_-$.

Remark 4.3. The estimates in Lemma 4.2 provide a bound for the terms in the expression (1.32) for $G_h(x,y)$ in the case of semi-implicit Euler proposals. For explicit Euler proposals, one also has to bound the term

$|\nabla U(y)|^2-|\nabla U(x)|^2 = |y+\nabla V(y)|^2-|x+\nabla V(x)|^2$.

Note that even when $V$ vanishes, this term cannot be controlled in terms of $\|\cdot\|_-$ in general. A valid upper bound is

$|\nabla U(y)+\nabla U(x)|\cdot|y-x|+L_2(x,y)\,\|\nabla U(y)+\nabla U(x)\|_-\cdot\|y-x\|_-$.
Proof of Lemma 4.2. By Lemma 4.1 and by definition of $L_n(x,y)$, we obtain

$|V(y)-V(x)| \le \sup_{z\in[x,y]}|\partial_{y-x}V(z)| \le L_1(x,y)\cdot\|y-x\|_-$,

$\left|V(y)-V(x)-\frac{y-x}{2}\cdot(\nabla V(y)+\nabla V(x))\right| \le \frac12\int_0^1t(1-t)\,dt\cdot\sup_{z\in[x,y]}|\partial^3_{y-x}V(z)| \le \frac{1}{12}L_3(x,y)\cdot\|y-x\|_-^3$, and

$(\nabla U(y)+\nabla U(x))\cdot(\nabla V(y)-\nabla V(x)) = \partial_{\nabla U(y)+\nabla U(x)}V(y)-\partial_{\nabla U(y)+\nabla U(x)}V(x)$
$\quad = \int_0^1\partial^2_{\nabla U(y)+\nabla U(x),\,y-x}V((1-t)x+ty)\,dt \le L_2(x,y)\cdot\|\nabla U(y)+\nabla U(x)\|_-\cdot\|y-x\|_-$.

Moreover, the estimate

$\|\nabla U(y)+\nabla U(x)\|_- \le 2\|\nabla U(x)\|_-+\|\nabla U(y)-\nabla U(x)\|_-$
$\quad \le 2\|\nabla U(x)\|_-+\|y-x\|_-+\|\nabla V(y)-\nabla V(x)\|_-$
$\quad \le 2\|\nabla U(x)\|_-+(1+L_2(x,y))\cdot\|y-x\|_-$

holds by definition of $L_2(x,y)$ and since

$\|\nabla V(y)-\nabla V(x)\|_- \le |\nabla V(y)-\nabla V(x)| = \sup_{|\xi|=1}(\partial_\xi V(y)-\partial_\xi V(x)) \le \sup_{\|\xi\|_-\le1}(\partial_\xi V(y)-\partial_\xi V(x))$. □

Recalling the definitions of $Y_h^{OU}(x)$ and $Y_h(x)$ from (1.23) and (1.27), we obtain:

Lemma 4.4. For $x\in\mathbb R^d$, $h\in(0,2)$ and $n\in\{1,2,3,4\}$ with $p_n\ge1$, we have:

(1) $\|Y_h^{OU}(x)-x\|_- \le \frac h2\|x\|_-+\sqrt h\,\|Z\|_-$,

(2) $\|Y_h(x)-x\|_- \le \frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-$,

(3) $\|Y_h^{OU}(x)\|_- \le (1-\frac h2)\|x\|_-+\sqrt h\,\|Z\|_-$,

(4) $\|Y_h(x)\|_- \le \|x\|_-+\frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-$,

(5) $L_n(x,Y_h^{OU}(x)) \le C_n2^{p_n-1}\left(\max(1,\|x\|_-)^{p_n}+h^{p_n/2}\|Z\|_-^{p_n}\right)$,

(6) $L_n(x,Y_h(x)) \le C_n3^{p_n-1}\left(\max(1,\|x\|_-)^{p_n}+(\frac h2)^{p_n}\|\nabla U(x)\|_-^{p_n}+h^{p_n/2}\|Z\|_-^{p_n}\right)$.

Proof. Estimates (1)-(4) are direct consequences of the triangle inequality. Moreover, by (3) and (4),

$\max(1,\|x\|_-,\|Y_h^{OU}(x)\|_-) \le \max(1,\|x\|_-)+\sqrt h\,\|Z\|_-$, and
$\max(1,\|x\|_-,\|Y_h(x)\|_-) \le \max(1,\|x\|_-)+\frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-$.

Estimates (5) and (6) now follow from (4.2) and Hölder's inequality. □

We now combine the estimates in Lemma 4.2 and Lemma 4.4 in order to prove Proposition 1.7 and the first part of Proposition 1.11.
Proof of Proposition 1.7. By (1.16) and Proposition 1.3, for $h\in(0,2)$,

(4.3)   $E[(1-\alpha_h(x,Y_h(x)))^k]^{1/k} \le \|G_h(x,Y_h(x))^+\|_{L^k} \le I+\frac{h}{8-2h}\cdot II$,

where

$I = \left\|V(Y_h(x))-V(x)-\frac{Y_h(x)-x}{2}\cdot(\nabla V(Y_h(x))+\nabla V(x))\right\|_{L^k}$,
$II = \left\|(\nabla U(Y_h(x))+\nabla U(x))\cdot(\nabla V(Y_h(x))-\nabla V(x))\right\|_{L^k}$.

By Lemma 4.2,

(4.4)   $I \le E\left[L_3(x,Y_h(x))^k\cdot\|Y_h(x)-x\|_-^{3k}\right]^{1/k}/12$, and

(4.5)   $II \le E\left[L_2(x,Y_h(x))^k\cdot\|Y_h(x)-x\|_-^k\times\left(2\|\nabla U(x)\|_-+(1+L_2(x,Y_h(x)))\cdot\|Y_h(x)-x\|_-\right)^k\right]^{1/k}$.

The assertion of Proposition 1.7 for semi-implicit Euler proposals is now a direct consequence of the estimates (2) and (6) in Lemma 4.4. The assertion for Ornstein-Uhlenbeck proposals follows similarly from (1.21) and the estimates (1) in Lemma 4.2 and (1), (3) and (5) in Lemma 4.4. □



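It is possible to write down the polynomial in Proposition 1.7 explicitly. For semi-implicit Euler proposals, we illustrate this in the case $k=1$ and $p_2=p_3=0$. Here, by (4.4) and (4.5) we obtain

$I \le \frac{C_3}{12}\,E\left[\left(h\|\nabla U(x)\|_-/2+\sqrt h\,\|Z\|_-\right)^3\right] \le \frac{C_3}{3}\left(h^3\|\nabla U(x)\|_-^3/8+h^{3/2}m_3\right)$,

$II \le C_2\,E\left[\left(h\|\nabla U(x)\|_-/2+\sqrt h\,\|Z\|_-\right)\times\left(2\|\nabla U(x)\|_-+(1+C_2)\left(h\|\nabla U(x)\|_-/2+\sqrt h\,\|Z\|_-\right)\right)\right]$
$\quad \le C_2\left[h\|\nabla U(x)\|_-^2+2\sqrt h\,\|\nabla U(x)\|_-m_1+(1+C_2)\left(\frac{h^2}{2}\|\nabla U(x)\|_-^2+2hm_2\right)\right]$.

Hence by (4.3), and since $h/(8-2h)\le h/4$ for $h\in(0,2)$,

(4.6)   $E[1-\alpha_h(x,Y_h(x))] \le h^{3/2}\cdot\left(\frac13C_3m_3+\frac12C_2m_1\|\nabla U(x)\|_-\right)$
$\qquad\qquad + h^2\cdot\left(\frac14C_2\|\nabla U(x)\|_-^2+\frac12C_2(1+C_2)m_2\right)$
$\qquad\qquad + h^3\cdot\left(\frac{1}{24}C_3\|\nabla U(x)\|_-^3+\frac18C_2(1+C_2)\|\nabla U(x)\|_-^2\right)$.

For Ornstein-Uhlenbeck proposals, we derive the explicit bound for the rejection probabilities stated in Proposition 1.11 for the case $k=1$ and $p_2=0$.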

Proof of Proposition 1.11, first part. If $p_2=0$ then for any $x\in\mathbb R^d$,

(4.7)   $\|\nabla V(x)\|_+ \le \|\nabla V(0)\|_++\|\nabla V(x)-\nabla V(0)\|_+ \le C_1+C_2\cdot\|x\|_-$.

Therefore, for any $x,y\in\mathbb R^d$,

$|V(y)-V(x)| \le (C_1+C_2\cdot\max(\|x\|_-,\|y\|_-))\cdot\|y-x\|_-$.

Hence, by (1.21) and by (1) and (3) in Lemma 4.4,

$E[1-\alpha^{OU}(x,Y_h^{OU}(x))] \le E[(V(Y_h^{OU}(x))-V(x))^+]$
$\quad \le E\left[(C_1+C_2\cdot\max(\|x\|_-,\|Y_h^{OU}(x)\|_-))\cdot\|Y_h^{OU}(x)-x\|_-\right]$
$\quad \le E\left[\left(C_1+C_2\cdot(\|x\|_-+\sqrt h\,\|Z\|_-)\right)\cdot\left(h\|x\|_-/2+\sqrt h\,\|Z\|_-\right)\right]$
$\quad = m_1(C_1+C_2\|x\|_-)\cdot h^{1/2}+\frac12\left(2m_2C_2+C_1\|x\|_-+C_2\|x\|_-^2\right)\cdot h+\frac12m_1C_2\|x\|_-\cdot h^{3/2}$. □



5. Dependence of rejection on the current state 
We now derive estimates for the derivatives of the functions

(5.1)   $F_h(x,w) = G_h(x,\,x-\frac h2\nabla U(x)+w)$,

(5.2)   $F_h^{OU}(x,w) = G^{OU}(x,\,x-\frac h2x+w)$,   $(x,w)\in\mathbb R^d\times\mathbb R^d$,

w.r.t. $x$. Since

(5.3)   $G_h(x,Y_h(x)) = F_h(x,\sqrt{h-h^2/4}\;Z)$ with $Z\sim\gamma^d$, and

(5.4)   $G^{OU}(x,Y_h^{OU}(x)) = F_h^{OU}(x,\sqrt{h-h^2/4}\;Z)$ with $Z\sim\gamma^d$,

these estimates can then be applied to control the dependence of rejection on the current state $x$.

For Ornstein-Uhlenbeck proposals, by (1.21), we immediately obtain

(5.5)   $\nabla_xF_h^{OU}(x,w) = \left(1-\frac h2\right)(\nabla V(y)-\nabla V(x))-\frac h2\nabla V(x)$,

where $y := (1-\frac h2)\,x+w$.

For semi-implicit Euler proposals, the formula for the derivative is more involved. To simplify the notation we set for $x\in\mathbb R^d$ and fixed $h\in(0,2)$:

(5.6)   $x' := x-\frac h2\nabla U(x)$.

In the sequel, we use the conventions

$v\cdot w = \sum_{i=1}^dv_iw_i$,  $(v\cdot T)_j = \sum_{i=1}^dv_iT_{i,j}$,  $(T\cdot v)_i = \sum_{j=1}^dT_{i,j}v_j$,  $(S\cdot T)_{i,k} = \sum_{j=1}^dS_{i,j}T_{j,k}$

for vectors $v,w\in\mathbb R^d$ and $(2,0)$ tensors $S,T\in\mathbb R^d\otimes\mathbb R^d$. In particular,

$v\cdot(S\cdot T) = (v\cdot S)\cdot T$,

i.e., the brackets may be omitted. We now give an explicit expression for the derivative of $F_h(x,w)$ w.r.t. the first variable:

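Proposition 5.1. Suppose $V\in C^2(\mathbb R^d)$. Then for any $x,w\in\mathbb R^d$,

(5.7)   $\nabla_xF_h(x,w) = \nabla V(y)-\nabla V(x)-\frac{y-x}{2}\cdot(\nabla^2V(y)+\nabla^2V(x))$
$\qquad + \frac h4(y-x)\cdot\nabla^2V(y)\cdot(I_d+\nabla^2V(x))$
$\qquad + \frac{h}{8-2h}(\nabla V(y)-\nabla V(x)+\nabla U(y)+\nabla U(x))\cdot(\nabla^2V(y)-\nabla^2V(x))$
$\qquad - \frac{h^2}{16-4h}(\nabla V(y)-\nabla V(x)+\nabla U(y)+\nabla U(x))\cdot\nabla^2V(y)\cdot(I_d+\nabla^2V(x))$

with $y := x'+w = x-\frac h2\nabla U(x)+w$.

Proof. Recall that $\nabla V(x)=\nabla U(x)-x$ for $x\in\mathbb R^d$, and hence $\nabla^2U=I_d+\nabla^2V$. By Proposition 1.3,

(5.8)   $F_h(x,w) = A_h(x,w)+\frac{h}{8-2h}B_h(x,w)$ for any $x,w\in\mathbb R^d$, where

$A_h(x,w) := V(x'+w)-V(x)-\frac{x'+w-x}{2}\cdot(\nabla V(x'+w)+\nabla V(x))$, and
$B_h(x,w) := (\nabla U(x'+w)+\nabla U(x))\cdot(\nabla V(x'+w)-\nabla V(x))$.

Noting that by (5.6),

(5.9)   $x-x' = \frac h2\nabla U(x) = \frac h2x+\frac h2\nabla V(x)$,

(5.10)   $\nabla_x(x-x') = \frac h2\nabla^2U(x) = \frac h2I_d+\frac h2\nabla^2V(x)$, and

(5.11)   $\nabla_xx' = I_d-\frac h2\nabla^2U(x) = \left(1-\frac h2\right)I_d-\frac h2\nabla^2V(x)$,

we obtain with $y=x'+w$:

$\nabla_xA_h(x,w) = \nabla V(x'+w)\cdot\left(I_d-\frac h2\nabla^2U(x)\right)-\nabla V(x)$
$\qquad - \frac{x'+w-x}{2}\cdot\left[\nabla^2V(x'+w)\cdot\left(I_d-\frac h2\nabla^2U(x)\right)+\nabla^2V(x)\right]$
$\qquad + \frac h4(\nabla V(x'+w)+\nabla V(x))\cdot\nabla^2U(x)$
$\quad = \nabla V(y)-\nabla V(x)-\frac{y-x}{2}\cdot(\nabla^2V(y)+\nabla^2V(x))$
$\qquad - \frac h4(\nabla V(y)-\nabla V(x))\cdot\nabla^2U(x)+\frac h4(y-x)\cdot(\nabla^2V(y)\cdot\nabla^2U(x))$,

$\nabla_xB_h(x,w) = (\nabla V(x'+w)-\nabla V(x))\cdot\left(\nabla^2U(x'+w)\cdot\left(I_d-\frac h2\nabla^2U(x)\right)+\nabla^2U(x)\right)$
$\qquad + (\nabla U(x'+w)+\nabla U(x))\cdot\left(\nabla^2V(x'+w)\cdot\left(I_d-\frac h2\nabla^2U(x)\right)-\nabla^2V(x)\right)$
$\quad = (\nabla V(y)-\nabla V(x))\cdot(\nabla^2U(y)+\nabla^2U(x))+(\nabla U(y)+\nabla U(x))\cdot(\nabla^2V(y)-\nabla^2V(x))$
$\qquad - \frac h2(\nabla V(y)-\nabla V(x))\cdot(\nabla^2U(y)\cdot\nabla^2U(x))-\frac h2(\nabla U(y)+\nabla U(x))\cdot(\nabla^2V(y)\cdot\nabla^2U(x))$.

In total, we obtain

$\nabla_xF_h(x,w) = \nabla_xA_h(x,w)+\frac{h}{8-2h}\nabla_xB_h(x,w)$
$\quad = \nabla V(y)-\nabla V(x)-\frac{y-x}{2}\cdot(\nabla^2V(y)+\nabla^2V(x))+\frac h4(y-x)\cdot\nabla^2V(y)\cdot\nabla^2U(x)$
$\qquad + \frac{h}{8-2h}(\nabla V(y)-\nabla V(x))\cdot(\nabla^2U(y)-\nabla^2U(x))+\left(\frac{2h}{8-2h}-\frac h4\right)(\nabla V(y)-\nabla V(x))\cdot\nabla^2U(x)$
$\qquad + \frac{h}{8-2h}(\nabla U(y)+\nabla U(x))\cdot(\nabla^2V(y)-\nabla^2V(x))$
$\qquad - \frac{h^2}{16-4h}(\nabla V(y)-\nabla V(x))\cdot\nabla^2U(y)\cdot\nabla^2U(x)-\frac{h^2}{16-4h}(\nabla U(y)+\nabla U(x))\cdot\nabla^2V(y)\cdot\nabla^2U(x)$
$\quad = \nabla V(y)-\nabla V(x)-\frac{y-x}{2}\cdot(\nabla^2V(y)+\nabla^2V(x))+\frac h4(y-x)\cdot\nabla^2V(y)\cdot\nabla^2U(x)$
$\qquad + \frac{h}{8-2h}(\nabla V(y)-\nabla V(x)+\nabla U(y)+\nabla U(x))\cdot(\nabla^2V(y)-\nabla^2V(x))$
$\qquad - \frac{h^2}{16-4h}\left((\nabla V(y)-\nabla V(x))\cdot(\nabla^2U(y)-I_d)+(\nabla U(y)+\nabla U(x))\cdot\nabla^2V(y)\right)\cdot\nabla^2U(x)$.

Here we have used that

$\nabla^2U = I_d+\nabla^2V$.

The assertion follows by applying this identity to the remaining $\nabla^2U$ terms as well. □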
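
Formula (5.7) can be checked against a finite-difference derivative in dimension one, where all tensor products reduce to ordinary products. The Python sketch below uses a hypothetical test potential $V(x)=x^4/4+\cos x$ with $U(x)=x^2/2+V(x)$; the function names and the evaluation point are illustrative only.

```python
import math

def V(x):
    return x ** 4 / 4 + math.cos(x)

def dV(x):
    return x ** 3 - math.sin(x)

def d2V(x):
    return 3 * x ** 2 - math.cos(x)

def dU(x):
    """U(x) = x^2/2 + V(x), so U'(x) = x + V'(x)."""
    return x + dV(x)

def G(x, y, h):
    """One-dimensional version of G_h for semi-implicit Euler proposals."""
    return (V(y) - V(x) - (y - x) / 2 * (dV(y) + dV(x))
            + h / (8 - 2 * h) * (dU(y) + dU(x)) * (dV(y) - dV(x)))

def F(x, w, h):
    """F_h(x, w) = G_h(x, x - (h/2) U'(x) + w), cf. (5.1)."""
    return G(x, x - h / 2 * dU(x) + w, h)

def dF_formula(x, w, h):
    """Right-hand side of (5.7) in dimension one."""
    y = x - h / 2 * dU(x) + w
    s = dV(y) - dV(x) + dU(y) + dU(x)
    return (dV(y) - dV(x) - (y - x) / 2 * (d2V(y) + d2V(x))
            + h / 4 * (y - x) * d2V(y) * (1 + d2V(x))
            + h / (8 - 2 * h) * s * (d2V(y) - d2V(x))
            - h ** 2 / (16 - 4 * h) * s * d2V(y) * (1 + d2V(x)))

def dF_numeric(x, w, h, eps=1e-5):
    """Central finite difference of F_h in its first argument."""
    return (F(x + eps, w, h) - F(x - eps, w, h)) / (2 * eps)

print(dF_formula(0.7, -0.4, 0.5), dF_numeric(0.7, -0.4, 0.5))
```

Agreement of the two numbers up to the finite-difference error supports the signs and prefactors $h/4$, $h/(8-2h)$ and $h^2/(16-4h)$ in (5.7).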



Similarly to Lemma 4.2 above, we now derive bounds for the individual summands in the expressions for $\nabla_xF_h^{OU}$ and $\nabla_xF_h$ in (5.5) and Proposition 5.1.

Lemma 5.2. For $V\in C^4(\mathbb R^d)$ and $x,y\in\mathbb R^d$ the following estimates hold:

(1) $\|\nabla V(y)-\nabla V(x)\|_+ \le L_2(x,y)\cdot\|y-x\|_-$,

(2) $\|\nabla V(y)-\nabla V(x)-\frac{y-x}{2}\cdot(\nabla^2V(y)+\nabla^2V(x))\|_+ \le \frac{1}{12}L_4(x,y)\cdot\|y-x\|_-^3$,

(3) $\|(y-x)\cdot\nabla^2V(y)\cdot(I_d+\nabla^2V(x))\|_+ \le L_2(y,y)\cdot(1+L_2(x,x))\cdot\|y-x\|_-$,

(4) $\|(\nabla V(y)-\nabla V(x)+\nabla U(y)+\nabla U(x))\cdot(\nabla^2V(y)-\nabla^2V(x))\|_+$
$\qquad \le L_3(x,y)\cdot(L_2(x,y)\|y-x\|_-+\|\nabla U(y)+\nabla U(x)\|_-)\cdot\|y-x\|_-$,

(5) $\|(\nabla V(y)-\nabla V(x)+\nabla U(y)+\nabla U(x))\cdot\nabla^2V(y)\cdot(I_d+\nabla^2V(x))\|_+$
$\qquad \le L_2(y,y)\cdot(L_2(x,y)\|y-x\|_-+\|\nabla U(y)+\nabla U(x)\|_-)\cdot(1+L_2(x,x))$.

Proof. (1) For any $\xi\in\mathbb R^d$, we have

$|\partial_\xi V(y)-\partial_\xi V(x)| \le \sup_{z\in[x,y]}|\partial^2_{y-x,\xi}V(z)| \le L_2(x,y)\,\|x-y\|_-\|\xi\|_-$.

This proves (1) by definition of $\|\cdot\|_+$.

(2) By Lemma 4.1 applied to $\partial_\xi V$,

$\left|\partial_\xi V(y)-\partial_\xi V(x)-\frac{y-x}{2}\cdot(\nabla\partial_\xi V(y)+\nabla\partial_\xi V(x))\right| \le \frac12\int_0^1t(1-t)\,dt\cdot\sup_{z\in[x,y]}|\partial^3_{y-x}\partial_\xi V(z)| \le \frac{1}{12}L_4(x,y)\,\|x-y\|_-^3\|\xi\|_-$.

(3) For $v,w\in\mathbb R^d$, we have

(5.12)   $|v\cdot\nabla^2V(y)\cdot w| = |\partial^2_{vw}V(y)| \le L_2(y,y)\,\|v\|_-\|w\|_-$.

Since $\|\cdot\|_-$ is weaker than the Euclidean norm, we obtain

$|(y-x)\cdot\nabla^2V(y)\cdot(I+\nabla^2V(x))\cdot\xi| \le L_2(y,y)\,\|y-x\|_-\|(I+\nabla^2V(x))\cdot\xi\|_- \le L_2(y,y)\,\|y-x\|_-(1+L_2(x,x))\,\|\xi\|_-$.

(4), (5) For $v,w\in\mathbb R^d$,

$|v\cdot(\nabla^2V(y)-\nabla^2V(x))\cdot w| = \left|\int_0^1\partial^3_{y-x,v,w}V((1-t)x+ty)\,dt\right| \le L_3(x,y)\,\|y-x\|_-\|v\|_-\|w\|_-$.

Therefore,

$|(\nabla V(y)-\nabla V(x)+\nabla U(y)+\nabla U(x))\cdot(\nabla^2V(y)-\nabla^2V(x))\cdot\xi|$
$\quad \le L_3(x,y)\,\|y-x\|_-\cdot(\|\nabla V(y)-\nabla V(x)\|_-+\|\nabla U(y)+\nabla U(x)\|_-)\cdot\|\xi\|_-$
$\quad \le L_3(x,y)\,\|y-x\|_-\cdot(L_2(x,y)\|y-x\|_-+\|\nabla U(y)+\nabla U(x)\|_-)\cdot\|\xi\|_-$,

and, correspondingly,

$|(\nabla V(y)-\nabla V(x)+\nabla U(y)+\nabla U(x))\cdot\nabla^2V(y)\cdot(I+\nabla^2V(x))\cdot\xi|$
$\quad \le L_2(y,y)\cdot(L_2(x,y)\|y-x\|_-+\|\nabla U(y)+\nabla U(x)\|_-)\cdot(1+L_2(x,x))\,\|\xi\|_-$. □

By combining Proposition 5.1 with the estimates in Lemma 5.2 and Lemma 4.4 we will now prove Proposition 1.9.

Proof of Proposition\rg Fix h G (0,2). By ffTTTD and ffLTSD . for any x,x G M'^, 

(5.13)   $\|\alpha_h(x,Y_h(x))-\alpha_h(\tilde x,Y_h(\tilde x))\|_{L^k} \le \|G_h(x,Y_h(x))-G_h(\tilde x,Y_h(\tilde x))\|_{L^k} \le \|x-\tilde x\|_-\cdot\sup_{z\in[x,\tilde x]}\big\|\,\|\nabla_zG_h(z,Y_h(z))\|_+\big\|_{L^k}$.

Moreover, by (5.3) and Proposition 5.1,

(5.14)   $\big\|\,\|\nabla_xG_h(x,Y_h(x))\|_+\big\|_{L^k} = \big\|\,\|\nabla_xF_h(x,\sqrt{h-h^2/4}\;Z)\|_+\big\|_{L^k} \le I+\frac h4\cdot II+\frac{h}{8-2h}\cdot III+\frac{h^2}{16-4h}\cdot IV$,

where

$I = E\left[\|\nabla V(Y_h(x))-\nabla V(x)-\frac12(Y_h(x)-x)\cdot(\nabla^2V(Y_h(x))+\nabla^2V(x))\|_+^k\right]^{1/k}$,
$II = E\left[\|(Y_h(x)-x)\cdot\nabla^2V(Y_h(x))\cdot(I+\nabla^2V(x))\|_+^k\right]^{1/k}$,
$III = E\left[\|(\nabla V(Y_h(x))-\nabla V(x)+\nabla U(Y_h(x))+\nabla U(x))\cdot(\nabla^2V(Y_h(x))-\nabla^2V(x))\|_+^k\right]^{1/k}$,
$IV = E\left[\|(\nabla V(Y_h(x))-\nabla V(x)+\nabla U(Y_h(x))+\nabla U(x))\cdot\nabla^2V(Y_h(x))\cdot(I+\nabla^2V(x))\|_+^k\right]^{1/k}$.

By applying the estimates from Lemma 5.2 and Lemma 4.2 (4), we obtain

(5.15)   $I \le \frac{1}{12}E\left[L_4(x,Y_h(x))^k\,\|Y_h(x)-x\|_-^{3k}\right]^{1/k}$,

(5.16)   $II \le (1+L_2(x,x))\cdot E\left[L_2(Y_h(x),Y_h(x))^k\,\|Y_h(x)-x\|_-^k\right]^{1/k}$,

(5.17)   $III \le E\left[L_3(x,Y_h(x))^k\,\|Y_h(x)-x\|_-^k\times\left((1+2L_2(x,Y_h(x)))\,\|Y_h(x)-x\|_-+2\|\nabla U(x)\|_-\right)^k\right]^{1/k}$,

(5.18)   $IV \le (1+L_2(x,x))\cdot E\left[L_2(Y_h(x),Y_h(x))^k\times\left((1+L_2(x,Y_h(x)))\,\|Y_h(x)-x\|_-+2\|\nabla U(x)\|_-\right)^k\right]^{1/k}$.

The assertion for semi-implicit Euler proposals is now a direct consequence of the estimates in Lemma 4.4, (4.2) and (5.13).

The assertion for Ornstein-Uhlenbeck proposals follows in a similar way from (5.5), Lemma 5.2 (1), and the estimates in Lemma 4.4. □



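Again, it is possible to write down the polynomial in Proposition 1.9 explicitly. For semi-implicit Euler proposals, we illustrate this in the case $k=1$ and $p_2=p_3=p_4=0$. Here, by (5.15), (5.16), (5.17) and (5.18) we obtain

$I \le \frac{C_4}{12}\,E\left[\left(\frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-\right)^3\right] \le \frac{C_4}{3}\left(\frac{h^3}{8}\|\nabla U(x)\|_-^3+h^{3/2}m_3\right)$,

$II \le (C_2+C_2^2)\,E\left[\frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-\right] = (C_2+C_2^2)\left(\frac h2\|\nabla U(x)\|_-+\sqrt h\,m_1\right)$,

$III \le C_3\left(2\|\nabla U(x)\|_-\cdot E\left[\frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-\right]+(1+2C_2)\,E\left[\left(\frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-\right)^2\right]\right)$
$\quad \le 2C_3\|\nabla U(x)\|_-\left(\frac h2\|\nabla U(x)\|_-+\sqrt h\,m_1\right)+C_3(1+2C_2)\left(\frac{h^2}{2}\|\nabla U(x)\|_-^2+2hm_2\right)$,

$IV \le (1+C_2)C_2\left((1+C_2)\,E\left[\frac h2\|\nabla U(x)\|_-+\sqrt h\,\|Z\|_-\right]+2\|\nabla U(x)\|_-\right)$
$\quad = 2(1+C_2)C_2\|\nabla U(x)\|_-+(1+C_2)^2C_2\left(\frac h2\|\nabla U(x)\|_-+\sqrt h\,m_1\right)$.

Hence by (5.14), and since $h/(8-2h)\le h/4$ and $h^2/(16-4h)\le h^2/8$ for $h\in(0,2)$,

(5.19)   $E[\|\nabla_xG_h(x,Y_h(x))\|_+] \le h^{3/2}\left(\frac13C_4m_3+\frac14(1+C_2)C_2m_1+\frac12C_3\|\nabla U(x)\|_-m_1\right)$
$\qquad + \frac18h^2\left(4C_3(1+2C_2)m_2+3C_2(1+C_2)\|\nabla U(x)\|_-+2C_3\|\nabla U(x)\|_-^2\right)$
$\qquad + \frac{1}{16}h^{5/2}C_2(1+C_2)^2\left(2m_1+h^{1/2}\|\nabla U(x)\|_-\right)$
$\qquad + \frac{1}{24}h^3\left(3C_3(1+2C_2)\|\nabla U(x)\|_-^2+C_4\|\nabla U(x)\|_-^3\right)$.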

For Ornstein-Uhlenbeck proposals, we prove the explicit bound for the dependence of the rejection probabilities on the current state for the case $k=2$ and $p_2=0$ as stated in Proposition 1.11.

Proof of Proposition 1.11, second part. If $p_2=0$ then by (5.4), (5.5) and (4.7),

$\|\nabla_xG^{OU}(x,Y_h^{OU}(x))\|_+ \le \|\nabla V(Y_h^{OU}(x))-\nabla V(x)\|_++\frac h2\|\nabla V(x)\|_+$
$\quad \le C_2\|Y_h^{OU}(x)-x\|_-+(C_1+C_2\|x\|_-)\,h/2$
$\quad \le C_2\|Z\|_-h^{1/2}+(C_1+2C_2\|x\|_-)\,h/2$

for any $x\in\mathbb R^d$. Therefore,

$E\left[\|\nabla_xG^{OU}(x,Y_h^{OU}(x))\|_+^2\right]^{1/2} \le C_2m_2^{1/2}h^{1/2}+(C_1+2C_2\|x\|_-)\,h/2$.

The assertion now follows similarly to (5.13). □
6. Upper bound for exit probabilities

In this section, we prove an upper bound for the exit probabilities of the MALA chain from the ball $B_R^-$ that is required in the proof of Theorem 1.19, cf. [12] for a detailed proof of a more general result. Let

(6.1)   $f(x) := \exp\left(K\|x\|_-^2/16\right)$.

The following lemma shows that $f(x)$ acts as a Lyapunov function for the MALA transition kernel on $B_R^-$:

Lemma 6.1. Suppose that Assumptions 1.5 and 1.14 hold. Then there exist constants $C_1,C_2,p_1\in(0,\infty)$ such that

(6.2)   $q_hf \le f^{1-Kh/4}\,e^{C_2h}$ on $B_R^-$

for any $R,h\in(0,\infty)$ such that $h^{-1}\ge C_1(1+R)^{p_1}$.

Proof. We first observe that a corresponding bound holds for the proposal kernel 
Ph- Indeed, by f ll.27p . and since = v ■ Gv with a non-negative definite 

symmetric matrix G < /, an explicit computation yields 



{pj){x) = E 



exp(i^||a; - ^Vf/(x) + ^/h - hy4:Zf_/ 16) 



< exp (^K{1 + Kh/A)\\x - ^VU{x)\\'i/16 
Moreover, by Assumption 11.141 

\\x-^vu{x)r_ < {i-^)\\x\t + ^\\vumt + ^\m{x)\\i. 

Hence by Assumption 11.51 there exist constants 6*3,(74, p2 G (0, oo) such that 

{Phf){x) < fixf-^^/'e''-^^ 

for any x E and h G (0, cxd) such that > 6*4(1 + ||x||_)^2. By the upper 
bound for the rejection probabilities in Proposition II. 7[ we conclude that there 
exists a polynomial s such that the corresponding upper bound 

^ jl-Kh/i^{C3+l)h 

holds on whenever both h'^ > dil + R)P'' and s{R)h^/^ f^^/^ < 1. The 

assertion follows, since the second condition is satisfied if K'^hR'^/GA < 1 and 

s{R)eh^/^ < 1. □ 

Now consider the first exit time

$T_R := \inf\{n\ge0 : X_n\notin B_R^-\}$,

where $(X_n,P_x)$ is the Markov chain with transition kernel $q_h$ and initial condition $X_0=x$ $P_x$-a.s. We can estimate $T_R$ by constructing a supermartingale based on the Lyapunov function $f$:

Theorem 6.2. If Assumptions 1.5 and 1.14 hold then there exist constants $C,p,D\in(0,\infty)$ such that the upper bound

(6.3)   $P_x[T_R\le n] \le Dnh\,\exp\left[K(\|x\|_-^2-R^2)/24\right]$

holds for any $n\ge0$ and $R,h\in(0,\infty)$ such that $h^{-1}\ge C(1+R)^p$, and for any $x\in B_R^-$.

Proof. Fix $n\in\mathbb N$, choose $C_1,C_2,p_1$ as in the lemma above, and let

$M_j := f(X_j)^{(1-Kh/4)^{n-j}}\exp\left(-C_2h\sum_{i=0}^{j-1}(1-Kh/4)^{n-i-1}\right)$

for $j=0,1,\dots,n$. If $h^{-1}\ge C_1(1+R)^{p_1}$ then by Jensen's inequality and (6.2),

$E_x[M_{j+1}\,|\,\mathcal F_j] \le (q_hf)(X_j)^{(1-Kh/4)^{n-j-1}}\exp\left(-C_2h\sum_{i=0}^{j}(1-Kh/4)^{n-i-1}\right) \le M_j$

on $\{X_j\in B_R^-\}$ for any $j<n$. Hence the stopped process $(M_{j\wedge T_R})_{0\le j\le n}$ is a supermartingale, and thus

$E_x[M_{T_R}\,;\,n-m\le T_R\le n] \le E_x[M_0]$ for any $0\le m\le n$.

Noting that $M_0=f(x)^{(1-Kh/4)^n}=\exp\left((1-Kh/4)^nK\|x\|_-^2/16\right)$, and

$M_{T_R} \ge \left(f(X_{T_R})\exp(-4C_2/K)\right)^{(1-Kh/4)^{n-T_R}} \ge \exp\left(\left(\frac{K}{16}R^2-\frac{4C_2}{K}\right)\cdot(1-Kh/4)^{n-T_R}\right)$,

we obtain the bound

$P_x[n-m\le T_R\le n] \le \exp\left(\left(1-\frac{Kh}{4}\right)^m\left(\frac{K}{16}\left(\left(1-\frac{Kh}{4}\right)^{n-m}\|x\|_-^2-R^2\right)+\frac{4C_2}{K}\right)\right)$

for any $0\le m\le n$ provided $R^2\ge64C_2/K^2$. In particular, if $mKh/2\le\log(3/2)$ then $(1-Kh/4)^m\ge\exp(-mKh/2)\ge2/3$, and hence

$P_x[n-m\le T_R\le n] \le \exp(4C_2/K)\cdot\exp\left(K(\|x\|_-^2-R^2)/24\right)$.

The assertion follows by partitioning $\{0,1,\dots,n\}$ into blocks of length $\le m$ where $m=\lfloor2\log(3/2)\,K^{-1}h^{-1}\rfloor$. □
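
The choice of the block length in the partition argument can be made concrete. The Python sketch below uses hypothetical values of $K$ and $h$ (with $Kh/4\le1/2$) and checks the chain of inequalities $(1-Kh/4)^m\ge\exp(-mKh/2)\ge2/3$ for $m=\lfloor2\log(3/2)/(Kh)\rfloor$; the helper names are illustrative.

```python
import math

def block_length(K, h):
    """m = floor(2*log(3/2)/(K*h)), the block length used in the partition argument."""
    return math.floor(2.0 * math.log(1.5) / (K * h))

def number_of_blocks(n, m):
    """Number of blocks of length at most m needed to cover {0, 1, ..., n}."""
    return math.ceil((n + 1) / m)

K, h, n = 1.0, 0.01, 10_000     # hypothetical values for illustration
m = block_length(K, h)
decay = (1.0 - K * h / 4.0) ** m
lower = math.exp(-m * K * h / 2.0)
print(m, number_of_blocks(n, m), decay, lower)
```

Since the number of blocks is of order $nKh$, summing the per-block bound over all blocks produces the factor $nh$ in (6.3), with the remaining constants absorbed into $D$.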

7. Proof of the main results 

In this section, we combine the results in order to derive the contraction properties for Metropolis-Hastings transition kernels stated in Theorems 1.15, 1.12 and 1.19, and we finally prove Theorem 1.20. Note that for $x,\tilde x\in\mathbb R^d$, the distances

(7.1)   $\|Y_h^{OU}(x)-Y_h^{OU}(\tilde x)\|_- = (1-h/2)\,\|x-\tilde x\|_-$, and

(7.2)   $\|Y_h(x)-Y_h(\tilde x)\|_- = \|x-\tilde x-(\nabla U(x)-\nabla U(\tilde x))\,h/2\|_-$

are deterministic. We now combine Lemma [2^ with the estimates in Propositions 
OandOJ 
Proof of Theorem 1.15. We fix $h\in(0,2)$, $R\in(0,\infty)$, and $x,\tilde x\in B_R^-$. By the basic contractivity Lemma 2.4 and by (2.3) respectively,

$E[\|W_h(x)-W_h(\tilde x)\|_-] \le \|x-\tilde x\|_-$
$\quad - \left(1-E[\max(1-\alpha_h(x,Y_h(x)),\,1-\alpha_h(\tilde x,Y_h(\tilde x)))]\right)\cdot\left(\|x-\tilde x\|_--\|Y_h(x)-Y_h(\tilde x)\|_-\right)$
$\quad + E\left[\max(\|x-Y_h(x)\|_-,\|\tilde x-Y_h(\tilde x)\|_-)^2\right]^{1/2}\cdot E\left[(\alpha_h(x,Y_h(x))-\alpha_h(\tilde x,Y_h(\tilde x)))^2\right]^{1/2}$.

By Proposition 3.3,

$\|Y_h(x)-Y_h(\tilde x)\|_- \le \left(1-Kh/2+M(R)^2h^2/8\right)\cdot\|x-\tilde x\|_-$,

and by Lemma 4.4 (2),

$E\left[\max(\|x-Y_h(x)\|_-,\|\tilde x-Y_h(\tilde x)\|_-)^2\right]^{1/2} \le m_2^{1/2}h^{1/2}+\max(\|\nabla U(x)\|_-,\|\nabla U(\tilde x)\|_-)\,h/2$.

The assertion of Theorem 1.15 follows by combining these estimates with the bounds for the acceptance probabilities in Propositions 1.7 and 1.9. □

The corresponding bound for Ornstein-Uhlenbeck proposals follows similarly from Lemma 2.4 and Proposition 1.11.

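Proof of Theorem 1.12. We again fix $h\in(0,2)$, $R\in(0,\infty)$, and $x,\tilde x\in B_R^-$. Since

$Y_h^{OU}(x)-Y_h^{OU}(\tilde x) = (1-h/2)(x-\tilde x)$ and $\|x-Y_h^{OU}(x)\|_- \le \|x\|_-h/2+\|Z\|_-\sqrt h$,

the basic contractivity Lemma 2.4 implies

$E\left[\|W_h^{OU}(x)-W_h^{OU}(\tilde x)\|_-\right] \le \left(1-\frac h2\right)\|x-\tilde x\|_-$
$\quad + \frac h2\,\|x-\tilde x\|_-\cdot E\left[\max(1-\alpha^{OU}(x,Y_h^{OU}(x)),\,1-\alpha^{OU}(\tilde x,Y_h^{OU}(\tilde x)))\right]$
$\quad + \left(\frac h2\max(\|x\|_-,\|\tilde x\|_-)+\sqrt h\,m_2^{1/2}\right)E\left[(\alpha^{OU}(x,Y_h^{OU}(x))-\alpha^{OU}(\tilde x,Y_h^{OU}(\tilde x)))^2\right]^{1/2}$.

The assertion of Theorem 1.12 follows by combining this estimate with the bounds for the acceptance probabilities in Proposition 1.11. □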

Proof of Theorem 1.19. Noting that

$\|x\|_-^2-(2R)^2 \le -3R^2$ for any $x\in B_R^-$,

the assertion is a direct consequence of Corollary 1.18 and Theorem 6.2 applied with $R$ replaced by $2R$. □

Let $\mu_R=\mu(\,\cdot\,|\,B_R^-)$ denote the conditional measure on $B_R^-$. The fact that $\mu$ is a stationary distribution for the Metropolis-Hastings transition kernel $q_h$ can be used to bound the Wasserstein distance between $\mu_Rq_h^n$ and $\mu_R$.

Lemma 7.1. For any $R\ge0$ and $a\in(0,1)$,

$\mathcal W_{2R}(\mu_Rq_h^n,\mu_R) \le 8R\,(\mu_Rq_h^n)(\mathbb R^d\setminus B_R^-)$
$\quad \le 8(1-a)^{-1}\,\mathcal W_{2R}(\mu_Rq_h^n,\delta_0q_h^n)+8R\,(\delta_0q_h^n)(\mathbb R^d\setminus B_{aR}^-)$.
Proof. The distance induced by the total variation norm $\|\cdot\|_{TV}$ is the Wasserstein distance w.r.t. the metric $d(x,y)=I_{\{x\ne y\}}$. Since $d_{2R}(x,y)\le4R\,d(x,y)$, we obtain

$\mathcal W_{2R}(\mu_Rq_h^n,\mu_R) \le 4R\,\|\mu_Rq_h^n-\mu_R\|_{TV} = 8R\,\|(\mu_Rq_h^n-\mu_R)^+\|_{TV}$

(7.3)   $\qquad \le 8R\,(\mu_Rq_h^n)(\mathbb R^d\setminus B_R^-)$.

Here we have used in the last step that $\mu q_h=\mu$, and hence

$(\mu_Rq_h^n)(A) \le (\mu_Rq_h^n)(A\cap B_R^-)+(\mu_Rq_h^n)(\mathbb R^d\setminus B_R^-) \le \mu_R(A)+(\mu_Rq_h^n)(\mathbb R^d\setminus B_R^-)$

for any Borel set $A\subseteq\mathbb R^d$. Moreover, for $a\in(0,1)$,

(7.4)   $\mathcal W_{2R}(\mu_Rq_h^n,\delta_0q_h^n) \ge (R-aR)\cdot\left((\mu_Rq_h^n)(\mathbb R^d\setminus B_R^-)-(\delta_0q_h^n)(\mathbb R^d\setminus B_{aR}^-)\right)$.

Indeed, for any coupling $\eta(dx\,d\tilde x)$ of the two measures,

$\eta(d_{2R}(x,\tilde x)\ge R-aR) \ge (\mu_Rq_h^n)(\mathbb R^d\setminus B_R^-)-(\delta_0q_h^n)(\mathbb R^d\setminus B_{aR}^-)$.

The assertion follows by combining the estimates in (7.3) and (7.4). □

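Proof of Theorem 1.20. By combining the estimates in Theorem 1.19, Lemma 7.1 with $a=6/7$, and Theorem 6.2, we obtain

$\mathcal W_{2R}(\nu q_h^n,\mu_R) \le \mathcal W_{2R}(\nu q_h^n,\mu_Rq_h^n)+\mathcal W_{2R}(\mu_Rq_h^n,\mu_R)$
$\quad \le \left(1-\frac K4h\right)^n\mathcal W_{2R}(\nu,\mu_R)+DR\exp(-KR^2/8)\,nh$
$\qquad + 56\cdot\left(1-\frac K4h\right)^n\mathcal W_{2R}(\mu_R,\delta_0)+56\,DR\exp(-KR^2/8)\,nh+8R\,P_0[T_{6R/7}<n]$
$\quad \le 58R\cdot\left(1-\frac K4h\right)^n+57\,DR\exp(-KR^2/8)\,nh+8DR\exp(-KR^2/33)\,nh$. □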
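
The numerical constants in the last chain of inequalities can be traced with exact rational arithmetic. The following Python sketch is only an arithmetic check of the bookkeeping, using the bounds $\mathcal W_{2R}(\nu,\mu_R)\le2R$ and $\mathcal W_{2R}(\mu_R,\delta_0)\le R$ that hold because the metric is capped at $2R$ and $\mu_R$ is supported in $B_R^-$; the variable names are illustrative.

```python
from fractions import Fraction

a = Fraction(6, 7)

# Prefactor 8 / (1 - a) from Lemma 7.1 with a = 6/7.
prefactor = Fraction(8) / (1 - a)

# W_2R(nu, mu_R) <= 2R and 56 * W_2R(mu_R, delta_0) <= 56 R.
contraction_constant = 2 + prefactor

# D R exp(-K R^2 / 8) n h appears once directly and 56 times via Lemma 7.1.
tail_constant = 1 + prefactor

# Theorem 6.2 with radius 6R/7 and x = 0 yields the rate (6/7)^2 / 24 in front of K R^2.
exit_rate = a * a / 24

print(prefactor, contraction_constant, tail_constant, exit_rate)
```

The exit rate $3/98$ is slightly larger than $1/33$, which is why the last term can be bounded using $\exp(-KR^2/33)$.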